SC21 Proceedings

The International Conference for High Performance Computing, Networking, Storage, and Analysis

StreamWare: A Scalable Framework for Accelerating Streaming Data Science


Authors: Viktor Prasanna (University of Southern California (USC)), David Bader (New Jersey Institute of Technology), David Brooks (Harvard University), Senjuti Basu Roy (New Jersey Institute of Technology), Zhihui Du (New Jersey Institute of Technology), Sanmukh Kuppannagari (University of Southern California (USC)), Xuehai Qian (University of Southern California (USC))

Abstract: This Birds-of-a-Feather session brings together a diverse community of interest around streaming data and the development of StreamWare, an open-source framework supported in part by the NSF Principles and Practice of Scalable Systems (PPoSS) program. The session will discuss the architecture of StreamWare and the opportunities for cross-layer optimizations from the architecture and system up to the applications. It will include a brief overview of StreamWare and an open forum for community input from developers and users.

Long Description: Goals: The goals of the session are to (a) build a community of developers and researchers who focus on streaming analytics, and (b) capture the needs and requirements of StreamWare, a framework that enables building streaming data science applications targeting a wide variety of scalable heterogeneous systems.

Topic: In grand challenge applications, the enormous amount of data produced by the sensing and instrumentation infrastructure loses its value after a small window of time. To obtain actionable intelligence from the data, streaming analytics (the ability to analyze in-motion data) is therefore becoming increasingly critical. Moreover, heterogeneous architectures consisting of processors and accelerators, integrated large high-bandwidth external memory, and cache-coherent interconnections are becoming popular both at the data-center level and on edge devices. Developing scalable streaming analytics applications for such heterogeneous computing platforms requires addressing challenges across the full system stack, from application to target platform.

StreamWare will allow developers to seamlessly build streaming data science applications targeting a wide variety of scalable systems. The fundamental approach of StreamWare is to perform optimizations that span layers. To enable cross-layer optimization, StreamWare will include novel cross-layer abstractions that facilitate information exchange between the layers. Application customization will be governed by performance models that facilitate design space exploration and by an extensible knowledge base that will contain optimal parameterized algorithmic and hardware implementations. StreamWare will exploit multiple dimensions of heterogeneity at both the application level and the target platform level. A novel notion of symbiotic scalability is defined to capture the impact of StreamWare's cross-layer optimizations as well as the graph structure of streaming applications. StreamWare will be evaluated using three or more high-impact streaming applications in the domains of astrophysics, smart grid, and network science.

Relevance to the expected HPC Audience: StreamWare is envisioned to enable high-performance streaming applications that scale up to exascale systems.

Expected Outcomes: The session organizers will document the session and disseminate an open report on it.


URL: https://streamware.org/

