Hubble Space Imagery on AWS: 28 Years of Data Now Available in the Cloud
Since going live in 1990, the Hubble Space Telescope has delivered groundbreaking images to broaden our understanding of the universe. Each image captured by the telescope is archived and made publicly available, free of cost, by NASA through the Space Telescope Science Institute (STScI).
The Hubble image archive is used by a global community of astronomers, researchers, and engineers and has led to the discovery of distant galaxies and nebulae. “The legacy is a treasure trove of data that can be mined in the future,” Arfon Smith, head of data science at STScI, said.
Addressing the Need for Computing
The community requires a massive amount of computing power for its scientific research. To improve researchers’ access to computing power, in May of 2018, STScI made over a decade of Hubble Space Telescope observation data available on Amazon Web Services (AWS) by transferring 110 TB of Hubble’s archival observations to Amazon Simple Storage Service (Amazon S3). By participating in the AWS Public Dataset program, STScI offers access to the data in the AWS Cloud, where the Hubble dataset joins a variety of other publicly available science datasets, including Landsat-8 imagery and the 1000 Genomes Project. All public data from the Hubble Space Telescope’s active instruments are available for large-scale analysis on Amazon S3.
The AWS Public Dataset program covers the cost of storage for publicly available high-value cloud-optimized datasets. This helps democratize access to data by making it available for analysis on AWS and fosters scientific communities that benefit from access to shared data alongside low-cost computing power. The program also supports development of new cloud-native techniques, formats, and tools that lower the cost of working with data.
Cloud Capabilities Facilitating Research
Now that the dataset is available on AWS, astronomers, academicians, engineers, and the general public can process large volumes of Hubble imagery without downloading and storing their own copies of the data. With the burden of acquiring data removed, researchers can review and query the data more quickly.
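To illustrate what working with the data in place can look like, here is a minimal Python sketch that reads only the first header block of a FITS file (the format Hubble observations are commonly distributed in) through an HTTP range request, rather than downloading the whole object. The bucket name and key in the usage comment are hypothetical placeholders, not the dataset's real location. The sketch also assumes the object can be read anonymously over HTTPS; a bucket that requires signed or requester-pays requests would need an AWS SDK instead.

```python
import urllib.request

FITS_BLOCK = 2880  # FITS files are organised in 2880-byte blocks
CARD = 80          # each header "card" is 80 ASCII characters


def s3_https_url(bucket, key, region="us-east-1"):
    """Build a virtual-hosted-style HTTPS URL for an S3 object.

    The default region is an assumption made for this sketch.
    """
    return f"https://{bucket}.s3.{region}.amazonaws.com/{key}"


def fetch_first_block(url, nbytes=FITS_BLOCK):
    """Request only the first nbytes of the object via an HTTP Range header."""
    req = urllib.request.Request(url, headers={"Range": f"bytes=0-{nbytes - 1}"})
    with urllib.request.urlopen(req) as resp:
        return resp.read()


def parse_cards(block):
    """Parse 'KEY = value / comment' cards from raw header bytes.

    Simplified: values are returned as strings, and escaped quotes
    inside string values are not handled.
    """
    cards = {}
    text = block.decode("ascii", errors="replace")
    for i in range(0, len(text), CARD):
        card = text[i:i + CARD]
        key = card[:8].strip()
        if key == "END":
            break
        if card[8:10] != "= ":
            continue  # skip COMMENT, HISTORY, and blank cards
        rest = card[10:].lstrip()
        if rest.startswith("'"):
            # Quoted string value: take the text up to the closing quote.
            value = rest[1:].split("'", 1)[0].rstrip()
        else:
            # Numeric or logical value: drop the trailing "/ comment".
            value = rest.split("/", 1)[0].strip()
        cards[key] = value
    return cards


# Usage (hypothetical bucket and key, shown for illustration only):
#   url = s3_https_url("example-bucket", "path/to/image.fits")
#   header = parse_cards(fetch_first_block(url))
#   print(header.get("INSTRUME"))
```

Because only a few kilobytes travel over the network per file, a scan of this kind can select the observations of interest before any full image is read.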
Let’s consider an example of what it might cost for a researcher to process all of the 120,000 or so images collected over the last nine years by the infrared channel of the Wide Field Camera 3 (WFC3/IR) aboard the Hubble Space Telescope. With AWS Lambda, developers and researchers can run code on AWS without having to worry about provisioning servers or planning infrastructure capacity, while only being charged for the compute time used to run the code. According to STScI, using Lambda, the entire collection of WFC3/IR images can be processed in about 2 minutes for an estimated cost of $2!
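As a rough sanity check on those figures, the back-of-the-envelope estimate below shows one set of inputs that lands near $2 and 2 minutes. The per-image duration (1 second), memory size (1 GB), and concurrency (1,000 simultaneous executions) are illustrative assumptions, not numbers reported by STScI. The two rates are assumed on-demand Lambda prices and should be checked against current AWS pricing for your region.

```python
import math

# Assumed on-demand Lambda rates in USD; verify against current AWS pricing.
PRICE_PER_REQUEST = 0.20 / 1_000_000
PRICE_PER_GB_SECOND = 0.0000166667


def estimate_lambda_run(n_images, seconds_per_image, memory_gb, concurrency):
    """Return (total cost in USD, wall-clock seconds) for one invocation per image."""
    compute_cost = n_images * seconds_per_image * memory_gb * PRICE_PER_GB_SECOND
    request_cost = n_images * PRICE_PER_REQUEST
    # Images are processed in "waves" of at most `concurrency` at a time.
    waves = math.ceil(n_images / concurrency)
    wall_clock_seconds = waves * seconds_per_image
    return compute_cost + request_cost, wall_clock_seconds


# Illustrative inputs: 120,000 images, 1 s each, 1 GB of memory, 1,000 in parallel.
cost, seconds = estimate_lambda_run(120_000, 1.0, 1.0, 1_000)
print(f"estimated cost: ${cost:.2f}, wall-clock time: {seconds / 60:.1f} min")
# prints "estimated cost: $2.02, wall-clock time: 2.0 min"
```

Under these assumptions nearly all of the cost is compute time, so halving the per-image duration or memory size roughly halves the bill.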
The availability of Hubble imagery alongside computing will give researchers the opportunity to break new ground. Even after the Hubble Telescope is decommissioned, the highly valuable dataset of space images – in the hands of researchers – will yield insights into our cosmos through the contributions of an empowered astronomy community.