Toward Scalable Data Processing in Python with CLIPPy
Event Type
Workshop
Algorithms
Architectures
Big Data
Data Analytics
Memory Systems
Numerical Algorithms
Time: Monday, 15 November 2021, 4:50pm - 5:20pm CST
Location: 224
Description: The Python programming language has become a popular choice for data scientists. While easy to use, Python is not well suited to driving data science on large-scale systems.
This paper presents a first prototype of CLIPPy (Command line interface plus Python), a user-side interface in Python that connects to high-performance computing environments with non-volatile memory. CLIPPy queries the available executable files and prepares a Python API on the fly. The executables offer an interface to a backend that can execute on large-scale systems, and they can be implemented in any language, for example C++. CLIPPy and the executables are loosely coupled and communicate through a JSON-based interface.
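The general idea of preparing a Python API on the fly from a set of executables can be illustrated with a minimal sketch. This is not CLIPPy's actual implementation or protocol: the `--describe` flag, the `name` and `doc` fields, the function names `build_api` and `_make_method`, and the use of stdin/stdout for JSON exchange are all assumptions made for illustration.

```python
import json
import os
import subprocess
from types import SimpleNamespace


def _make_method(path):
    """Wrap one executable as a Python callable that exchanges JSON."""
    def method(**kwargs):
        # Assumed convention: keyword arguments are sent as a JSON object
        # on stdin, and the executable prints a JSON result on stdout.
        proc = subprocess.run(
            [path],
            input=json.dumps(kwargs),
            capture_output=True,
            text=True,
            check=True,
        )
        return json.loads(proc.stdout)
    return method


def build_api(directory):
    """Scan `directory` for executables and expose each one as a method."""
    api = SimpleNamespace()
    for entry in sorted(os.listdir(directory)):
        path = os.path.join(directory, entry)
        if not (os.path.isfile(path) and os.access(path, os.X_OK)):
            continue
        # Hypothetical flag: `--describe` prints a JSON self-description
        # such as {"name": "...", "doc": "..."}.
        described = subprocess.run(
            [path, "--describe"],
            capture_output=True,
            text=True,
            check=True,
        )
        spec = json.loads(described.stdout)
        method = _make_method(path)
        method.__name__ = spec["name"]
        method.__doc__ = spec.get("doc", "")
        setattr(api, spec["name"], method)
    return api
```

Under these assumptions, `api = build_api("/path/to/backend/bin")` would yield an object whose methods mirror the executables found there. Because the only coupling is the JSON exchanged per call, the backend executables could be written in any language.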
The underlying philosophy, design challenges, and a prototype implementation that accesses data stored in non-volatile memory will be discussed.