BEGIN:VCALENDAR
VERSION:2.0
PRODID:Linklings LLC
BEGIN:VTIMEZONE
TZID:America/Chicago
X-LIC-LOCATION:America/Chicago
BEGIN:DAYLIGHT
TZOFFSETFROM:-0600
TZOFFSETTO:-0500
TZNAME:CDT
DTSTART:19700308T020000
RRULE:FREQ=YEARLY;BYMONTH=3;BYDAY=2SU
END:DAYLIGHT
BEGIN:STANDARD
TZOFFSETFROM:-0500
TZOFFSETTO:-0600
TZNAME:CST
DTSTART:19701101T020000
RRULE:FREQ=YEARLY;BYMONTH=11;BYDAY=1SU
END:STANDARD
END:VTIMEZONE
BEGIN:VEVENT
DTSTAMP:20211207T055402Z
LOCATION:Online
DTSTART;TZID=America/Chicago:20211115T153000
DTEND;TZID=America/Chicago:20211115T155500
UID:submissions.supercomputing.org_SC21_sess342_ws_worksp102@linklings.com
SUMMARY:The Benefits of Prefetching for Large-Scale Cloud-Based Neuroimagi
 ng Analysis Workflows
DESCRIPTION:Workshop\n\nThe Benefits of Prefetching for Large-Scale Cloud-
 Based Neuroimaging Analysis Workflows\n\nHayot-Sasson, Glatard, Rokem\n\nT
 o support the growing demands of neuroscience applications, researchers ar
 e transitioning to cloud computing for its scalable, robust and elastic in
 frastructure. Nevertheless, large datasets residing in object stores may r
 esult in significant data transfer overheads during workflow execution. Pr
 efetching, a method to mitigate the cost of reading in mixed workloads, ma
 sks data transfer costs within the processing time of prior tasks. We
  present
  an implementation of “Rolling Prefetch”, a Python library that implements
  a particular form of prefetching from AWS S3 object store, and we quantif
 y its benefits.\n\nRolling Prefetch extends S3Fs, a Python library exposin
 g AWS S3 functionality via a file object, to add prefetch capabilities. In
  measurements of analysis performance on a 500 GB brain connectivity
  dataset stor
 ed on S3, we found that prefetching provides significant speed-ups of up t
 o 1.86×, even in applications consisting entirely of data loading. The obs
 erved speed-up values are consistent with our theoretical analysis. Our re
 sults demonstrate the usefulness of prefetching for scientific data proces
 sing on cloud infrastructures and provide an implementation applicable to 
 various application domains.\n\nTag: Online Only, Cloud and Distributed Co
 mputing, Scientific Computing, Workflows\n\nRegistration Category: Worksho
 p Reg Pass
END:VEVENT
END:VCALENDAR
