Tuesday, 16 November 2021
Location: Second Floor Atrium
Code Generation and Optimization for Deep-Learning Computations on GPUs via Multi-Dimensional Homomorphisms
Learning-Based Content Delivery in 5G-Enabled Multi-Access Edge Computing
Parallel Framework for Updating Large-Scale Dynamic Networks
Chaining Multiple Tools and Libraries Using Gotcha
FPGA-Accelerated Ripples
HDF5 VOL Connector to Apache Arrow
Fargraph: Optimizing Graph Workload on RDMA-Based Far Memory Architecture
Breadth-First Search on Xilinx Versal
Open-Source High-Performance Computing for Applications in Engineering: DEM, SPH and Multi-Agent Vehicle Simulations with Project Chrono
Parallel GRB Source Localization Pipelines for the Advanced Particle-Astrophysics Telescope
Accelerating Parallel Monte Carlo Simulations for Statistical Physics: Portability on Many-Core Processors
Hardware Acceleration of Complex Machine Learning Models through Modern High-Level Synthesis
An Interactive GPU Metric Dashboard for HPC clusters
Towards Optimal Graph Coloring Using Rydberg Atoms
Analyzing Complex Memory Systems
Support in OpenMP for Multi-GPU Parallelism
Towards a Scalable and Distributed High-Performance SHAD C++ library
Multiple Same Level and Telescoping Nesting in GFDL’s FV3
Optimizing and Extending the Functionality of EXARL for Scalable Reinforcement Learning
Detecting Network Intrusion Anomalies through Egonet-Based Data Mining with Apache Spark
Padding to Extend the Bruck Algorithm for Non-Uniform All-to-All Communication
GASNet-EX Memory Kinds: Support for Device Memory in PGAS Programming Models
Hashed-Coordinate Storage of Sparse Tensors
Feature Reduction of Darshan Counters Using Evolutionary Algorithms
RIKEN CGRA: Data-Driven Architecture as an Extension of Multicore CPU for Future HPC
Core-Idling on MPI Intra-Node Communication Channels for Energy Efficiency
Performance Analysis of Containerized OrangeFS in HPC Environment
Handling C++ Exceptions in MPI Applications
Monitoring Urban Changes with Ensemble of Neural Networks and Deep-Temporal Remote Sensing Data
Utilizing Persistent Memory in Parallel I/O Libraries
SODA-OPT: System-Level Design in MLIR for HLS
Fusion Research Using Azure A100 HPC instances
HyperQueue: Overcoming Limitations of HPC Job Managers
Accelerating the Visualization of Spatio-Temporal Simulations with Non-Evolving Meshes
Enabling Combustion Science Simulations for Future Exascale Machines
Capturing Relationships Based on Structure Similarity for Self-Describing Scientific Data Formats
Towards an Efficient Parallel Skeleton for Generic Iterative Stencil Computations in Distributed GPUs
Accurate Throughput Prediction of Basic Blocks on Recent Intel Microarchitectures
Similarity Measurement for Proxy Application Fidelity
A Fast Parameter-Free Preconditioner for Structured Grid Problems
Detecting and Identifying Applications by Job Signatures
Flexible GMRES with Analog Accelerators
Heterogeneous Computing for Undergraduates: A Module-Driven Approach
