Parallel SIMD - A Policy Based Solution for Free Speed-Up using C++ Data-Parallel Types

SC21 Proceedings

Workshop: ESPM2 2021: Sixth International Workshop on Extreme Scale Programming Models and Middleware

Authors: Srinivas Yadav (Keshav Memorial Institute of Technology, India; Louisiana State University, Center for Computation and Technology); Nikunj Gupta (University of Illinois); Auriane Reverdell (Swiss National Supercomputing Centre (CSCS)); and Hartmut Kaiser (Louisiana State University, Center for Computation and Technology)

Abstract: Recent additions to the C++ standard and ongoing standardization efforts aim to add data-parallel types to the C++ standard library. These types enable vectorization in existing C++ codes without relying on the compiler's ability to auto-vectorize them. Integrating the existing parallel algorithms with these new data-parallel types opens up a new way of speeding up existing codes with minimal effort. Today, very little implementation experience exists for data-parallel execution of the standard parallel algorithms. In this paper, we report on experiences and performance analysis results for our implementation of two new data-parallel execution policies usable with HPX's parallel algorithms module: simd and par_simd. We use the experimental implementation of data-parallel types provided by the C++ standard libraries shipped with recent versions of GCC and Clang. The benchmark results collected from artificial tests and real-world codes presented in this paper are very promising. Compared to sequenced execution, we report speed-ups of more than three orders of magnitude when HPX's parallel algorithms are executed with the newly implemented data-parallel execution policy par_simd. We also report that our implementation is performance portable across different compute architectures (x64 from Intel and AMD, and Arm) and different vectorization technologies (AVX2, AVX512, NEON64, and NEON128).
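To make the idea of a data-parallel type concrete, the following is a minimal sketch using std::experimental::simd from the Parallelism TS v2, assuming a standard library that ships the <experimental/simd> header (for example libstdc++ in GCC 11 or later) and C++17. The function name saxpy_simd and the loop structure are hypothetical illustrations, not code from the paper; the simd and par_simd policies described above are meant to apply this kind of pack-wise processing inside HPX's parallel algorithms so that user code does not have to write such loops by hand.

```cpp
#include <cstddef>
#include <experimental/simd>
#include <vector>

namespace stdx = std::experimental;

// Hypothetical helper (not from the paper): computes y[i] = a * x[i] + y[i]
// by processing one hardware-sized pack of floats per loop iteration.
void saxpy_simd(float a, const std::vector<float>& x, std::vector<float>& y)
{
    using pack = stdx::native_simd<float>;    // pack width chosen for the target ISA
    const std::size_t n = x.size() < y.size() ? x.size() : y.size();
    const std::size_t width = pack::size();   // number of floats per pack
    const pack av(a);                         // broadcast the scalar to all lanes

    std::size_t i = 0;
    // Main loop: load, compute, and store a whole pack at a time.
    for (; i + width <= n; i += width)
    {
        pack xv, yv;
        xv.copy_from(&x[i], stdx::element_aligned);
        yv.copy_from(&y[i], stdx::element_aligned);
        yv = av * xv + yv;
        yv.copy_to(&y[i], stdx::element_aligned);
    }
    // Scalar tail: handle the elements that do not fill a complete pack.
    for (; i < n; ++i)
        y[i] = a * x[i] + y[i];
}
```

Calling saxpy_simd(2.0f, x, y) updates y in place. The pack width is selected at compile time from the target flags (for instance -mavx2 on x64), which is one reason the same source can map to different vector instruction sets.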
