AWS Public Sector Blog

How DigitalGlobe Uses Amazon SageMaker to Manage Machine Learning at Scale

A guest post by Jay Littlepage, VP of Infrastructure and Operations, Maxar Technologies and VP of IT, DigitalGlobe

If you have ever searched for directions, called an Uber, or looked up a trailhead, you have used DigitalGlobe’s imagery or information derived from it. DigitalGlobe went all-in on AWS to meet the growing demand for commercial geo-intelligence, migrating its entire 18-year imagery archive to the cloud. The company used AWS Snowmobile to transfer 100 petabytes of data, allowing it to move away from large file-transfer protocols and delivery workflows.

DigitalGlobe also wanted to provide on-demand access to its data while still managing its AWS spend. The company turned to machine learning to address its caching problem, training the caching algorithm to find relevance in customer access patterns. Do customers keep looking within the same image, or in images nearby? Can the location of the next access be predicted? The answer to that last question turned out to be yes.
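To make the questions above concrete, here is a minimal sketch of a cache that exploits spatial locality. It is not DigitalGlobe's model: a fixed "prefetch the four adjacent tiles" rule stands in for a learned predictor, and the class name, the (row, col) tile IDs, and the capacity are all hypothetical choices made for illustration.

```python
from collections import OrderedDict


class PrefetchingTileCache:
    """LRU cache over hypothetical (row, col) tile IDs.

    A fixed neighbor-prefetch rule stands in for a learned access predictor.
    """

    def __init__(self, capacity, prefetch_neighbors=True):
        self.capacity = capacity
        self.prefetch_neighbors = prefetch_neighbors
        self._tiles = OrderedDict()  # insertion order doubles as recency order
        self.hits = 0
        self.misses = 0

    def _put(self, tile):
        # Insert (or refresh) a tile, evicting least recently used entries.
        self._tiles[tile] = True
        self._tiles.move_to_end(tile)
        while len(self._tiles) > self.capacity:
            self._tiles.popitem(last=False)

    def access(self, tile):
        # Count a hit if the tile is already cached, otherwise a miss.
        if tile in self._tiles:
            self.hits += 1
            self._tiles.move_to_end(tile)
        else:
            self.misses += 1
            self._put(tile)
        if self.prefetch_neighbors:
            # Assumption: the next request is likely to be an adjacent tile.
            row, col = tile
            for neighbor in ((row - 1, col), (row + 1, col),
                             (row, col - 1), (row, col + 1)):
                if neighbor not in self._tiles:
                    self._put(neighbor)

    def hit_rate(self):
        total = self.hits + self.misses
        return self.hits / total if total else 0.0


# A made-up access trace that wanders between adjacent tiles.
walk = [(0, 0), (0, 1), (0, 2), (1, 2), (1, 3)]

with_prefetch = PrefetchingTileCache(capacity=16)
without_prefetch = PrefetchingTileCache(capacity=16, prefetch_neighbors=False)
for tile in walk:
    with_prefetch.access(tile)
    without_prefetch.access(tile)
```

On this toy trace only the first request misses when prefetching is on, while every request misses without it. Real access patterns are far less tidy, which is presumably why a trained model rather than a fixed rule is worth the effort.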

DigitalGlobe now uses Amazon SageMaker to handle machine learning at scale. Amazon SageMaker is a fully managed service that enables developers and data scientists to quickly and easily build, train, and deploy machine-learning models at any scale. Amazon SageMaker removes the barriers that typically slow down developers who want to use machine learning.

“Adopting AWS as our foundational storage and analytics infrastructure is allowing us to solve big data challenges for our customers,” said Dr. Walter Scott, Chief Technology Officer of Maxar Technologies and founder of DigitalGlobe. “We’ve gained speed, agility, and resiliency from the cloud, and our teams, partners, and customers can now innovate faster than ever before with our entire library at their fingertips. By leveraging an array of AWS machine learning services, we are able to put the world’s most important images into our customers’ hands, allowing them to better understand our changing planet and save lives, resources, and time.”

By using Amazon SageMaker, DigitalGlobe more than doubled its cache hit rate, which now often sits around 83% and sometimes trends toward 90%. This also allowed the company to cut its cloud storage cost in half by making better use of its S3 optimized cache and retrieving less from its 100+ PB archive.
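A rough back-of-the-envelope reading of those figures, under stated assumptions: if the hit rate more than doubled to reach about 83%, the earlier hit rate was below about 41.5%, so the share of requests that fall through to the archive dropped from at least 58.5% to about 17%. The function name below is hypothetical, and the calculation says nothing about the cost figure, since the post gives no pricing details.

```python
def min_archive_fetch_reduction(hit_rate_after, improvement_factor):
    """Lower bound on the relative drop in archive fetches (cache misses).

    Assumes every cache miss results in exactly one archive retrieval.
    """
    # "Improved by more than a factor of N" bounds the earlier hit rate from above.
    hit_rate_before_max = hit_rate_after / improvement_factor
    miss_rate_before_min = 1.0 - hit_rate_before_max
    miss_rate_after = 1.0 - hit_rate_after
    return 1.0 - miss_rate_after / miss_rate_before_min


# 83% hit rate after at least a 2x improvement, as quoted in the post.
reduction = min_archive_fetch_reduction(0.83, 2.0)
```

With these inputs the reduction works out to roughly 71%, meaning archive retrievals fell by at least that much relative to the earlier rate.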


Watch this video from re:Invent 2017 of Dr. Walter Scott, CTO and founder at DigitalGlobe, describing how his team, with little to no machine learning experience, was able to achieve results in just two weeks.

Get started here.