Presentation


Semantic-Aware Lossless Data Compression for Deep Learning Recommendation Model (DLRM)

Session

7th Workshop on Machine Learning in High Performance Environment

Author/Presenters

Sarunya Pumma

Abhinav Vishnu

Event Type

Workshop

Time

Monday, 15 November 2021, 3:25pm - 3:55pm CST

Location

Online

Description

Deep Learning Recommendation Model (DLRM), a new neural network for recommendation systems, introduces challenging requirements for deep neural network training and inference. A DLRM model is typically too large to fit in the memory of a single GPU. When running on multiple GPUs, DLRM requires both model parallelism and data parallelism, for the bottom part and the top part of the model respectively. Because of this hybrid-parallel scheme, all-to-all communication is used to join the top and bottom parts together. We have observed that this all-to-all communication is costly and is a bottleneck in DLRM training and inference.

In this presentation, we reduce the communication volume by exploiting DLRM's properties to compress the transferred data without information loss. We demonstrate the benefits of our method by training DLRM TeraByte on AMD Instinct MI100 accelerators. The experimental results show a 38%-59% improvement in the time-to-solution of DLRM TeraByte training for FP32 and mixed precision.
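
The abstract does not say which DLRM properties the compression relies on. As a purely illustrative sketch, and not the authors' method, the Python below assumes one plausible property: the same embedding row may be looked up several times within a batch, so the data sent in the all-to-all step can contain repeated rows. Under that assumption, each distinct row is sent once together with an index list, and the receiver rebuilds the original payload exactly. The names `compress_rows` and `decompress_rows` are hypothetical.

```python
# Hypothetical illustration only: NOT the method presented in this talk.
# Assumption: a batch of embedding rows contains exact duplicates.
import struct
from typing import Dict, List, Sequence, Tuple

Row = Tuple[float, ...]


def compress_rows(rows: Sequence[Row]) -> Tuple[List[Row], List[int]]:
    """Return (unique_rows, index) with rows[i] == unique_rows[index[i]]."""
    unique_rows: List[Row] = []
    position: Dict[bytes, int] = {}  # bit pattern of a row -> slot in unique_rows
    index: List[int] = []
    for row in rows:
        # Key on the raw 64-bit pattern so 0.0 and -0.0 stay distinct
        # and the round trip is bit-exact.
        key = struct.pack(f"{len(row)}d", *row)
        if key not in position:
            position[key] = len(unique_rows)
            unique_rows.append(row)
        index.append(position[key])
    return unique_rows, index


def decompress_rows(unique_rows: Sequence[Row], index: Sequence[int]) -> List[Row]:
    """Rebuild the original sequence of rows from the compressed form."""
    return [unique_rows[i] for i in index]


# Toy batch: the first row appears three times.
batch = [(0.1, 0.2), (0.3, 0.4), (0.1, 0.2), (0.1, 0.2)]
unique_rows, index = compress_rows(batch)
restored = decompress_rows(unique_rows, index)
```

In this toy batch, two rows plus four small integers are transferred instead of four full rows. Any real saving depends on how often rows actually repeat, which the abstract does not quantify.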

Author/Presenters

Sarunya Pumma

Advanced Micro Devices (AMD) Inc

Abhinav Vishnu

Advanced Micro Devices (AMD) Inc
