Evaluation of Two Topology-Aware Heuristics on Level-3 BLAS Library for Multi-GPU Platforms
Event Type
Workshop
Tags
Online Only
Applications
Extreme Scale Computing
Heterogeneous Systems
Parallel Programming Languages and Models
Software Engineering
Registration Categories
W
Time
Friday, 19 November 2021, 10:50am - 11:10am CST
Location
Online
Description
GPUs now dominate the market in terms of computing performance per unit of power, and numerous research works have provided GPU-accelerated implementations of the Basic Linear Algebra Subprograms (BLAS). Several software libraries have been developed to exploit the performance of systems with accelerators, but on platforms with multiple GPUs the achieved performance may remain far from the platform's peak. This paper presents two runtime heuristics that improve performance when task-based programs run on heterogeneous architectures such as multi-GPU systems. The first is a topology-aware policy that takes into account the heterogeneity of the high-speed links that interconnect GPUs. The second is an optimistic heuristic that favors communication between devices. Both have been implemented in the XKBlas BLAS-3 library. We ran experiments on an NVIDIA DGX-1 with up to 8 V100 GPUs on a set of BLAS routines. Experimental results on kernels showed that XKBlas outperformed most implementations, even when the overhead of creating and scheduling dynamic tasks is included.
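As a rough illustration of what a topology-aware policy could look like, the sketch below picks, among the devices that already hold a valid copy of a data tile, the one with the fastest link to the destination GPU. This is not the XKBlas implementation: the function name `pick_source`, the `HOST` identifier, and the bandwidth values are all hypothetical and chosen only for illustration.

```python
from typing import Dict, Iterable, Tuple

# Hypothetical map from (source, destination) device pairs to link bandwidth in GB/s.
Bandwidth = Dict[Tuple[int, int], float]

HOST = -1  # hypothetical identifier for host memory


def pick_source(dest: int, holders: Iterable[int], bw: Bandwidth) -> int:
    """Choose where to fetch a tile from, preferring the fastest link to dest."""
    holders = list(holders)
    if dest in holders:
        return dest  # data is already local, no transfer needed
    if not holders:
        return HOST  # no device copy exists, fall back to host memory
    # Unknown pairs are treated as having zero bandwidth.
    return max(holders, key=lambda src: bw.get((src, dest), 0.0))


# Illustrative values only: GPU 0 has a faster link to GPU 1 than GPU 2 does.
example_bw = {(0, 1): 50.0, (2, 1): 25.0}
chosen = pick_source(1, [0, 2], example_bw)
```

With these illustrative bandwidths, a tile needed on GPU 1 and present on GPUs 0 and 2 would be fetched from GPU 0. A real runtime would also have to account for link contention and transfers already in flight.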