TY - GEN
T1 - Orchestration of dockerized data reduction pipelines from a RESTful web service
AU - Miszalski, Brent
AU - O'Toole, Simon
AU - Tocknell, James
AU - Harischandra, Lloyd
AU - Mannering, Elizabeth
AU - Sealey, Katrina
PY - 2021
Y1 - 2021
AB - Data reduction pipelines are traditionally run on a researcher's personal computer on a small amount of data. The pipeline may have complex software dependencies that preclude the researcher from installing it on a faster server. Even if the pipeline runs on the server, its deployment in a multiprocessing environment may be problematic. Here we present a modern solution to allow for the on-demand reduction of the ever-increasing volume of data stored in telescope archives. We have developed a Python web service that accepts 2dF-AAOmega observations and determines the steps needed to reduce the data. Each step runs 2dFdr commands from a Docker container on a fast server. We use docker-py to remotely execute these commands from within Celery tasks, allowing for a robust, configurable data reduction workflow to be assembled and executed asynchronously by Celery across several processors. Data Central plans to offer the service to users when requesting data from the newly revamped AAT archive, allowing effortless access to freshly reduced data. The service is extensible to other pipelines and would form a solid basis for developing IVOA SODA services, while slight modifications could unlock quick-turnaround reductions of transient-triggered observations.
M3 - Conference proceeding contribution
SN - 9781583819517
SN - 9781583819333
T3 - Astronomical Society of the Pacific Conference Series
SP - 265
EP - 268
BT - Astronomical Data Analysis Software and Systems XXX
A2 - Ruiz, Jose Enrique
A2 - Pierfederici, Francesco
A2 - Teuben, Peter
PB - Astronomical Society of the Pacific
CY - San Francisco, California
ER -