Authors: Christine Kirkpatrick (San Diego Supercomputer Center (SDSC)), Alex Szalay (Johns Hopkins University), John Goodhue (Massachusetts Green High Performance Computing Center (MGHPCC)), Kenton McHenry (University of Illinois), Julie Ma (Massachusetts Green High Performance Computing Center (MGHPCC))
Abstract: The Open Storage Network (OSN) is an NSF-funded distributed data sharing service intended to facilitate exchanges of active scientific data sets between research organizations, providing easy access and high bandwidth delivery of large data sets to researchers. Since its inception in 2017, the OSN has been prototyping its service offerings with a community of friendly researchers. In fall 2020, OSN transitioned to a production-level pilot and began marketing both its services and the opportunity to participate in the network. In January 2021, OSN became a resource allocatable through the XSEDE XRAS process, where users can request allocations of 1TB-50TB.
Long Description: The Open Storage Network (OSN) is an NSF-funded distributed data sharing service
intended to facilitate exchanges of active scientific data sets between research
organizations, communities and projects, providing easy access and high bandwidth
delivery of large data sets to researchers.
The OSN serves two principal purposes: (1) enable the smooth flow of large data sets
between resources such as instruments, campus data centers, national supercomputing
centers, and cloud providers; and (2) facilitate access to long-tail data sets by the
scientific community. Examples of data currently available on the OSN include synthetic
data from ocean models; the widely used Extracted Features Set from the HathiTrust
Digital Library; open-access earth sciences data from Pangeo; and geophysical data
from BCO-DMO. Researchers are using these data sets to train machine learning
models, validate simulations, and perform statistical analysis of live data. The target OSN
user community is well represented among SC attendees.
OSN data is housed in storage pods, each providing up to a petabyte of storage and
interconnected by national high-performance networks. The result is well-connected,
cloud-like storage that is easily accessible at data transfer rates comparable to or
exceeding those of the public cloud storage providers. Users can temporarily park data for
retrieval by a collaborator or create a repository of active research data. OSN leverages
Ceph, commodity hardware, Ansible, and other open source methodologies and
technologies. OSN pods are typically located in Science DMZs at the host institution, and
secure access is provided via the InCommon Federation.
Since its inception in 2017, the OSN has been prototyping its service offerings with a
community of friendly researchers while building out the network of pods. The Schmidt
Foundation provided funding for the first prototypes.
In fall 2020, OSN transitioned to a production-level pilot and began marketing to the
research computing community both its services and the opportunity to participate in the
network through the purchase of storage pods. The outreach began with a four-part webinar
series that culminated in April 2021 with a session that attracted over 450 registrations.
In January 2021, OSN also became a resource allocatable through the XSEDE XRAS
process, where users can request startup allocations of 1TB-10TB and production
allocations of 1TB-50TB. Allocations greater than 50TB and up to 300TB can be
requested by contacting the OSN team directly.
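The size ranges above can be read as a simple lookup from requested capacity to request route. The following is a minimal sketch of that reading, not an OSN or XSEDE tool: the function name and the route labels are hypothetical, and only the terabyte limits come from the text.

```python
from typing import List

# Size ranges in terabytes, taken from the allocation description above.
# The route labels below are hypothetical names chosen for this sketch.
STARTUP_TB = (1, 10)
PRODUCTION_TB = (1, 50)
DIRECT_MAX_TB = 300


def allocation_routes(requested_tb: float) -> List[str]:
    """Return the request routes whose stated size range covers requested_tb."""
    routes = []
    if STARTUP_TB[0] <= requested_tb <= STARTUP_TB[1]:
        routes.append("xras-startup")
    if PRODUCTION_TB[0] <= requested_tb <= PRODUCTION_TB[1]:
        routes.append("xras-production")
    if PRODUCTION_TB[1] < requested_tb <= DIRECT_MAX_TB:
        routes.append("contact-osn-team")
    # Sizes below 1 TB or above 300 TB are not covered by the text,
    # so the sketch returns an empty list for them.
    return routes
```

For example, a 5TB request falls in both the startup and production ranges, while a 120TB request falls only in the range handled by contacting the OSN team.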
This BoF will begin with a brief update on international trends and the state
of the art in research storage, followed by details about the OSN, how to use it and how
to get involved. This will be followed by brief presentations from two users. Finally, we will solicit feedback from BoF participants about their research storage needs, the suitability of OSN to address these needs, and any barriers to use that they see.
This BoF is suitable for in-person or remote participation. If selected and presented live, we will include remote participation via Zoom to encourage broader attendance.
URL: https://www.openstoragenetwork.org/