Start Using Landsat on AWS
My colleague Jed Sundwall wrote the guest post below to show you how one of the newest AWS Public Data Sets is being put to use.
You can now access over 85,000 Landsat 8 scenes through our newest Public Data Set: Landsat on AWS. The scenes are all available in the landsat-pds bucket in the Amazon S3 US West (Oregon) region.
Landsat is an Earth observation program run jointly by the U.S. Geological Survey (USGS) and NASA that creates moderate-resolution satellite imagery of all land on Earth every 16 days. The Landsat program has been running since 1972 and is the longest ongoing project to collect such imagery. Landsat 8 is the newest Landsat satellite, and it gathers data based on visible, infrared, near-infrared, and thermal-infrared light.
Because of Landsat’s global purview and long history, it has become a reference point for all Earth observation work and is considered the gold standard of natural resource satellite imagery. It is the basis for research and applications in many global sectors, including agriculture, cartography, geology, forestry, regional planning, surveillance, and education. Much of our customers’ work couldn’t be done without Landsat.
As we said in December, we hope to accelerate innovation in climate research, humanitarian relief, and disaster preparedness efforts around the world by making Landsat data readily available near our flexible computing resources. We have committed to host up to a petabyte of Landsat data as a contribution to the White House’s Climate Data Initiative. Because the imagery is available on AWS, researchers and software developers can use any of our on-demand services to perform analysis and create new products without needing to worry about storage or bandwidth costs.
You can learn more about how to access the data on our Landsat on AWS page.
What’s possible with Landsat on AWS
We’ve been testing our approach to hosting Landsat imagery over the past few months and have been amazed by what people have been able to do with it.
Development Seed has updated the popular open source landsat-util library to use data from Landsat on AWS. Developers who rely on landsat-util can now access Landsat data more quickly and with more processing options. Learn more about the updates to landsat-util. Here’s a screenshot of their Libra image browser:
Esri has created a demonstration of how ArcGIS Online can quickly render Landsat data for visualization and analysis within the browser. Visit Esri’s site to see how powerful and beautiful Landsat data can be.
Mapbox is using Landsat on AWS to power Landsat-live, a map that is constantly refreshed with the latest imagery from NASA’s Landsat 8 satellite. Learn more about Landsat-live. The map offers the freshest Landsat imagery possible on a global level, with Mapbox street data overlaid on top to show as much context as possible:
MathWorks has created a freely downloadable tool for accessing, processing, and visualizing Landsat data in MATLAB. With this tool, you can create a map display of scene locations with markers that show each scene’s metadata. Learn more about the tool and watch a demo video of it on the MathWorks blog.
Planet Labs uses Landsat data for image rectification and as a reference point for its own Earth observing satellites. Learn how Planet Labs uses Landsat on AWS to quickly create better products for its customers.
Left: a Landsat image of the Lower Se San 2 Dam in Cambodia taken on December 22, 2014. Right: A Planet Labs image of the dam taken less than a month later on January 14, 2015.
Accessing the Landsat Data
Rather than hosting each Landsat scene as a .tar archive that contains the scene’s 12 bands and metadata, we make each band of each scene available as a stand-alone GeoTIFF, and we host the scene’s metadata as both a text file and a JSON file.
The data are organized using a directory structure based on each scene’s path and row. For instance, the files for Landsat scene LC80030172015001LGN00 are available in the following location:
The “L8” directory refers to Landsat 8, “003” refers to the scene’s path, “017” refers to the scene’s row, and the final directory matches the scene’s identifier. This identifier takes the form LXSPPPRRRYYYYDDDGSIVV and is segmented as follows:
- L = Landsat
- X = Sensor
- S = Satellite
- PPP = WRS path
- RRR = WRS row
- YYYY = Year
- DDD = Julian day of year
- GSI = Ground station identifier
- VV = Archive version number
In this case, the scene corresponds to WRS path 003, WRS row 017, and was taken on the 1st day of 2015.
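The identifier layout above can be unpacked mechanically. The following is a minimal sketch, not part of the official tooling: the names `SceneId`, `parse_scene_id`, and `scene_prefix` are hypothetical, and the key prefix is built on the assumption that the top-level directory is always "L" followed by the satellite digit, as in the "L8" example.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass(frozen=True)
class SceneId:
    """Fields of an LXSPPPRRRYYYYDDDGSIVV identifier (hypothetical helper type)."""
    raw: str
    sensor: str          # X
    satellite: str       # S
    path: str            # PPP, kept zero-padded
    row: str             # RRR, kept zero-padded
    acquired: date       # derived from YYYY and DDD
    ground_station: str  # GSI
    version: str         # VV


def parse_scene_id(scene_id: str) -> SceneId:
    """Split a 21-character scene identifier into its documented segments."""
    if len(scene_id) != 21 or not scene_id.startswith("L"):
        raise ValueError(f"not an LXSPPPRRRYYYYDDDGSIVV identifier: {scene_id!r}")
    year = int(scene_id[9:13])
    day_of_year = int(scene_id[13:16])
    if not 1 <= day_of_year <= 366:
        raise ValueError(f"day of year out of range: {day_of_year}")
    # Day 001 is January 1, so offset by day_of_year - 1.
    acquired = date(year, 1, 1) + timedelta(days=day_of_year - 1)
    return SceneId(
        raw=scene_id,
        sensor=scene_id[1],
        satellite=scene_id[2],
        path=scene_id[3:6],
        row=scene_id[6:9],
        acquired=acquired,
        ground_station=scene_id[16:19],
        version=scene_id[19:21],
    )


def scene_prefix(scene: SceneId) -> str:
    """Build the directory prefix, assuming the L<satellite>/<path>/<row>/<id>/ layout."""
    return f"L{scene.satellite}/{scene.path}/{scene.row}/{scene.raw}/"
```

For the example scene, `parse_scene_id("LC80030172015001LGN00")` yields path "003", row "017", and an acquisition date of January 1, 2015, which matches the description above.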
Each scene’s directory includes:
- A .TIF GeoTIFF for each of the scene’s bands (the GeoTIFFs include 512×512 internal tiling and there can be up to 12 bands),
- A .TIF.ovr overview file for each .TIF (useful in GDAL-based applications),
- A small RGB preview JPEG (3% of the original size),
- A large RGB preview JPEG (15% of the original size),
- An index.html file that can be viewed in a browser to see the RGB preview, and
- Links to the GeoTIFFs and metadata files.
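When working with a listing of a scene directory, it can help to sort the entries into the kinds described above. The sketch below does this by file suffix alone. The function names are hypothetical, the `.jpg`/`.jpeg` extension for the previews and the `.txt`/`.json` extensions for metadata are assumptions based on the descriptions in this post, and the file names in the usage example are placeholders rather than real object keys.

```python
from typing import Dict, Iterable, List


def classify_scene_file(name: str) -> str:
    """Return a coarse category for one file name in a scene directory."""
    lower = name.lower()
    if lower == "index.html":
        return "index"
    # Check the overview suffix first, since ".tif.ovr" does not end in ".tif".
    if lower.endswith(".tif.ovr"):
        return "overview"
    if lower.endswith(".tif"):
        return "band"
    if lower.endswith((".jpg", ".jpeg")):   # assumed preview extension
        return "preview"
    if lower.endswith((".txt", ".json")):   # assumed metadata extensions
        return "metadata"
    return "other"


def group_scene_files(names: Iterable[str]) -> Dict[str, List[str]]:
    """Group file names by category, preserving input order within each group."""
    groups: Dict[str, List[str]] = {}
    for name in names:
        groups.setdefault(classify_scene_file(name), []).append(name)
    return groups


if __name__ == "__main__":
    # Placeholder names for illustration only.
    listing = ["SCENE_B4.TIF", "SCENE_B4.TIF.ovr", "SCENE_small.jpg", "index.html"]
    for kind, files in group_scene_files(listing).items():
        print(kind, files)
```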
For instance, the files associated with scene LC80030172015001LGN00 are available at:
A gzipped CSV describing all available scenes is available at:
If you use the AWS Command Line Interface (AWS CLI), you can access the bucket with this simple shell command:
$ aws s3 ls landsat-pds
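Once you have downloaded the gzipped CSV scene list mentioned above, you can filter it with the Python standard library alone. This is a rough sketch rather than a documented interface: the function names are hypothetical, and the column names `path` and `row` are assumptions about the CSV header that you should check against the actual file.

```python
import csv
import gzip
import io
from typing import BinaryIO, Dict, Iterator


def read_scene_list(stream: BinaryIO) -> Iterator[Dict[str, str]]:
    """Yield one dict per CSV row from a gzip-compressed binary stream."""
    with gzip.open(stream, mode="rt", encoding="utf-8", newline="") as text:
        yield from csv.DictReader(text)


def scenes_for(stream: BinaryIO, path: int, row: int) -> Iterator[Dict[str, str]]:
    """Yield rows whose (assumed) 'path' and 'row' columns match the given WRS values."""
    for record in read_scene_list(stream):
        if int(record["path"]) == path and int(record["row"]) == row:
            yield record
```

To use it, open the downloaded file in binary mode, for example `with open(local_path, "rb") as f:`, and iterate over `scenes_for(f, 3, 17)`. Here `local_path` stands for wherever you saved the file.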
We’d like to thank our customers at Development Seed, Esri, Mapbox, MathWorks, and Planet Labs who helped us launch and test this public data set. New collaborators are welcome to contribute to the scripts we use to acquire and process Landsat data on GitHub.
— Jed Sundwall, Open Data Technical Business Manager