AWS Blog

Earth Science on AWS with new CGIAR and Landsat Public Data Sets

by Jeff Barr | in Public Data Sets

To support the growing number of Earth science researchers on AWS, we are adding two new members to the collection of Public Data Sets on AWS: CGIAR Global Circulation Models (GCM) data and imagery from the Landsat 8 satellite.

If you have been a regular reader of this blog, you may recall some of our earlier work to encourage Earth science research. For example, last year we announced that you can Process Earth Science Data on AWS With NASA / NEX Public Data Sets. Earlier this year we announced the Amazon Climate Research Grants and subsequently made 12 awards (I’ll have more to say about the recipients and the results in a bit).

Supporting Climate Research with CGIAR Data
We are working with CGIAR (a consortium of international agricultural research centers) to make their data more accessible and more easily available, with the expectation that it will lead to innovative ways to address critical food security and development challenges. We expect worldwide public access to this data to help researchers address rural poverty, improve human health and nutrition, and manage the Earth’s natural resources in a sustainable fashion.

Earth’s climate is changing. We believe that it is important to understand how the changes will affect agriculture and the world’s ability to feed its ever-growing population. By making CGIAR’s Global Circulation Models (GCM) available, we are giving researchers what is presently believed to be the most important tool for understanding how the climate could change in the next hundred years. Making this data available in the Cloud will allow developers to build applications that give non-experts the ability to access information about current and future climates in a visual fashion.

The GCM data comes from the CCAFS Climate Portal and is stored in Amazon S3 at s3://cgiardata (refer to the CCAFS-Climate Data page for more info). The bucket holds about 6 TB of data, spread across roughly 66,000 files in ESRI Grid and ARC ASCII GRID formats, all zipped. You can download the data that you need to an EC2 instance using the AWS Command Line Interface (CLI) or the AWS Tools for Windows PowerShell. The GCM Documentation contains additional information about the structure of the data. The following diagram will help you to identify the files that you need:
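As a rough illustration of what working with one of the zipped ARC ASCII GRID files might look like once it has been copied out of the bucket (for instance with the CLI's aws s3 cp command), here is a minimal Python sketch that uses only the standard library. It assumes the usual ASCII grid layout: a few "keyword value" header lines such as ncols, nrows, cellsize, and NODATA_value, followed by rows of cell values. The function names, the sample.asc member name, and the tiny sample grid are made up for the example and do not come from the CGIAR bucket; check the GCM Documentation for the real file layout before relying on any of this.

```python
import io
import zipfile


def read_ascii_grid(text):
    """Parse ASCII grid text into (header dict, list of rows)."""
    header = {}
    values = []
    for line in text.splitlines():
        parts = line.split()
        if not parts:
            continue
        # Header lines look like "ncols 3"; data lines start with a number.
        if parts[0][0].isalpha():
            header[parts[0].lower()] = float(parts[1])
        else:
            values.extend(float(p) for p in parts)
    ncols, nrows = int(header["ncols"]), int(header["nrows"])
    if len(values) != ncols * nrows:
        raise ValueError("cell count does not match ncols * nrows")
    nodata = header.get("nodata_value")
    # Replace the NODATA sentinel with None so it cannot skew statistics.
    cells = [None if v == nodata else v for v in values]
    rows = [cells[r * ncols:(r + 1) * ncols] for r in range(nrows)]
    return header, rows


def read_zipped_grid(zip_source, member=None):
    """Read one .asc member from a zip archive (a path or a file-like object)."""
    with zipfile.ZipFile(zip_source) as archive:
        if member is None:
            # Assumption for this sketch: use the first .asc entry found.
            member = next(
                n for n in archive.namelist() if n.lower().endswith(".asc")
            )
        text = archive.read(member).decode("ascii")
    return read_ascii_grid(text)


# Hypothetical 3 x 2 grid used only to exercise the functions above.
SAMPLE = """ncols 3
nrows 2
xllcorner -180.0
yllcorner -60.0
cellsize 0.5
NODATA_value -9999
1 2 -9999
4 5 6
"""


def demo():
    """Build an in-memory zip holding SAMPLE and read it back."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        archive.writestr("sample.asc", SAMPLE)
    buffer.seek(0)
    return read_zipped_grid(buffer)


if __name__ == "__main__":
    header, rows = demo()
    print(header["ncols"], header["nrows"], rows)
```

To use this on a downloaded file, pass the local path of the zip archive to read_zipped_grid. Files at the full resolution of the data set may be large, so a production tool would more likely stream the rows or use a dedicated raster library than hold every cell in a Python list.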

Coming in Early 2015 – Landsat Imagery
Landsat is a program managed by the United States Geological Survey (USGS) that creates moderate-resolution satellite imagery of all land on Earth every 16 days. The Landsat program has been running since 1972 and is the longest ongoing project to collect such imagery. Because of Landsat’s global purview and long history, it has become a reference point for all Earth observation work and is considered the gold standard of satellite imagery. It is the basis for research and applications in many global sectors, including agriculture, cartography, geology, forestry, regional planning, and Earth science education.

In support of the White House’s Climate Data Initiative, we have committed to making up to a petabyte of Landsat Earth imagery data from the USGS widely available as an AWS Public Data Set. In early 2015, new imagery produced by the Landsat 8 satellite will be available for anyone to access via Amazon S3. By making Landsat data readily available near our flexible computing resources, we hope to accelerate innovation in climate research, humanitarian relief, and disaster preparedness efforts around the world. Because the imagery will be available in the cloud, researchers will be able to use whatever tools they want to perform analysis without needing to worry about storage or bandwidth costs. Take a look at the post Putting Landsat 8’s Bands to Work on the MapBox Blog to see what can be done with this data.

We are currently looking for partners who are interested in contributing expertise, open source tools, and educational materials that will help to accelerate climate research using Landsat on AWS. If you are interested in helping out or if you would like to be notified when this data becomes available, please go here and fill out the form.