Case Study of Using Kokkos and SYCL as Performance-Portable Frameworks for MILC-DSLASH Benchmark on NVIDIA, AMD, and Intel GPUs
Event Type
Online Only
Heterogeneous Systems
Parallel Programming Languages and Models
Productivity Tools
Software Engineering
Registration Categories
Time
Sunday, 14 November 2021, 2:30pm - 3pm CST
Description
In this paper, we introduce a GPU-friendly parallel implementation of Milc-Dslash that exposes multiple hierarchies of parallelism in the algorithm. Milc-Dslash was designed to serve as a benchmark with highly optimized matrix-vector multiplications to measure resource utilization on GPU systems. The parallel hierarchies in the Milc-Dslash algorithm are mapped onto the target hardware using the Kokkos and SYCL programming models. We present the performance achieved by the Kokkos and SYCL implementations of Milc-Dslash on an NVIDIA A100 GPU, an AMD MI100 GPU, and an Intel Gen9 GPU. Additionally, we compare the Kokkos and SYCL performance with that of versions written in the CUDA and HIP programming models on the NVIDIA A100 GPU and AMD MI100 GPU, respectively.
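To make "multiple hierarchies of parallelism" concrete, the following is a minimal serial C++ sketch of the kind of nested matrix-vector work a Dslash-style kernel performs. It is not the authors' implementation: the names (`apply_links`, `identity_matrix`, `kDirs`), the data layout, and the choice of four loop levels are assumptions made for illustration, and the neighbor-site lookup of a real Dslash stencil is deliberately left out. Each loop level marks a place where a framework such as Kokkos or SYCL could plausibly assign work to teams, threads, or vector lanes.

```cpp
#include <array>
#include <complex>
#include <cstddef>
#include <vector>

using Complex = std::complex<double>;
using ColorVector = std::array<Complex, 3>;                 // 3 color components per site
using ColorMatrix = std::array<std::array<Complex, 3>, 3>;  // 3x3 complex link matrix

constexpr std::size_t kDirs = 4;  // assumed: one link matrix per space-time direction

// Hypothetical helper: returns the 3x3 identity matrix.
ColorMatrix identity_matrix() {
  ColorMatrix m{};  // value-initialized, so every entry starts at zero
  for (std::size_t i = 0; i < 3; ++i) m[i][i] = Complex{1.0, 0.0};
  return m;
}

// Simplified stand-in for a Dslash application:
//   out[s] = sum over d of links[s][d] * in[s]
// A real stencil would read in[] at neighboring sites; that is omitted here.
std::vector<ColorVector> apply_links(
    const std::vector<std::array<ColorMatrix, kDirs>>& links,
    const std::vector<ColorVector>& in) {
  std::vector<ColorVector> out(in.size(), ColorVector{});
  for (std::size_t s = 0; s < in.size(); ++s) {        // level 1: lattice sites
    for (std::size_t d = 0; d < kDirs; ++d) {          // level 2: directions
      for (std::size_t row = 0; row < 3; ++row) {      // level 3: matrix rows
        Complex acc{0.0, 0.0};
        for (std::size_t col = 0; col < 3; ++col) {    // level 4: row-vector dot product
          acc += links[s][d][row][col] * in[s][col];
        }
        out[s][row] += acc;
      }
    }
  }
  return out;
}
```

With identity link matrices in every direction, each output vector is simply `kDirs` times the input vector, which gives a quick sanity check. The outermost loop has no dependencies between sites, so it is the natural candidate for the coarsest level of parallelism, while the inner loops involve a reduction into `out[s][row]` that a parallel version would need to handle explicitly.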