Presentation

· Contributors · Organizations · Search Program

Towards an Efficient Parallel Skeleton for Generic Iterative Stencil Computations in Distributed GPUs

SessionResearch Posters Display

Authors

Manuel de Castro

Inmaculada Santamaria-Valenzuela

Sergio Miguel-López

Yuri Torres

Arturo Gonzalez-Escribano

Event Type

Posters

Research Posters

Registration Categories

TimeThursday, 18 November 20218:30am - 5pm CST

LocationSecond Floor Atrium

DescriptionIterative stencil applications present a high degree of parallelism. The approach is appropriate for modern many-core systems, like GPUs. The huge arrays needed in many real problems, however, require multiple-GPUs memory. Manually programming multi-GPU solutions is complex, involving synchronization, communication and optimization problems. Previous specific frameworks for multi-GPU stencils present limitations on stencil types and target devices. Another approach is to build efficient solutions using modern portable heterogeneous programming models. We present a high-productivity parallel programming skeleton implemented as a C library function. It can derive the communication structure and kernel details from an abstract specification of the stencil pattern, splitting the computation to better overlap it with communications. Preliminary experimental results show a good scalability in a cluster with up to 16 GPUs. It also outperforms a state-of-the-art solution based on Celerity/SYCL. The presentation will include a short demo video using the artifact, and both on-site and on-line discussion.

Archive view

Authors

Manuel de Castro

University of Valladolid, Spain

Inmaculada Santamaria-Valenzuela

University of Valladolid, Spain

Sergio Miguel-López

University of Valladolid, Spain

Yuri Torres

University of Valladolid, Spain

Arturo Gonzalez-Escribano

University of Valladolid, Spain

No Travel? No Problem.