Understanding the I/O Impact on the Performance of High-Throughput Molecular Docking
Event Type
Workshop
Data Analytics
Data Management
File Systems and I/O
Storage
Time
Monday, 15 November 2021, 10:55am - 11:20am CST
Location
223
Description
High-throughput molecular docking is a data-driven simulation methodology that estimates the position and interaction strength of millions of molecules (ligands) when they interact with a given protein site. Because of its data-driven nature, its performance depends on how fast data can be ingested into the processing pipeline and how efficiently docking results can be written to a shared file. In this work, we characterize the I/O performance of a high-performance, high-throughput molecular docking application called Docker-HT, running on a supercomputer on up to 512 compute nodes with two different parallel I/O configurations. We show that a tuned I/O configuration improves the overall parallel efficiency from 71% to 90% on 512 nodes, and we identify and resolve a performance degradation observed when running on 16 and 32 nodes.
