Amazon Robotics Uses Amazon SageMaker and AWS Inferentia to Enable ML Inferencing at Scale
Software Engineer, Amazon Robotics
Building an ML Model to Replace Manual Scanning
Amazon Robotics uses its software and machinery to automate the flow of inventory in Amazon fulfillment centers. There are three main physical components to the company’s system: mobile shelving units, robots, and employee workstations. The robots deliver mobile shelving units to stations, and employees either put inventory in (stowing) or take it out (picking). “Our existing stow-and-pick workflows can sometimes create a bottleneck for downstream processing,” says Eli Gallaudet, a senior software manager at Amazon Robotics. “In 2017, we kicked off an initiative to figure out how to make some of those workflows simpler.”
Looking to reduce time-consuming bin scanning, Amazon Robotics built the Intent Detection System, a deep-learning-based computer vision system trained on millions of video examples of stowing actions. The company wanted to train the system to automatically identify where associates place inventory items. Knowing it would need cloud compute to deploy the deep-learning models to Amazon fulfillment centers, Amazon Robotics turned to AWS. The team deployed its models to Docker containers, hosting them using Amazon Elastic Container Service (Amazon ECS), a fully managed container orchestration service.
Once the team had collected enough video examples of stowing actions, it experimented with applying model architectures to the large annotated video dataset. After several iterations, the team could begin letting the deployed models automate the process.
Shifting Hosting and Management to Amazon SageMaker
Although Amazon Robotics could tap into ample compute resources on AWS, the company still had to handle hosting itself. When AWS announced the release of Amazon SageMaker at AWS re:Invent 2017, Amazon Robotics quickly adopted it, avoiding the need to build a costly hosting solution of its own. Amazon Robotics was the first company to deploy to Amazon SageMaker on a large scale and remains one of the largest deployments as of January 2021.
At first, the team primarily used Amazon SageMaker to host models. Amazon Robotics adapted its service usage as needed, initially using a hybrid architecture that ran some algorithms on premises and some in the cloud. “We built a core set of functionalities that enabled us to deliver the Intent Detection System,” says Tim Stallman, a senior software manager at Amazon Robotics. “And then as Amazon SageMaker features came online, we slowly started adopting those.” For example, the team adopted Amazon SageMaker Experiments, a capability that enabled the team to organize, track, compare, and evaluate ML experiments and model versions.
Amazon Robotics also used Amazon SageMaker automatic scaling. “Amazon SageMaker doesn’t just manage the hosts we use for inferencing,” says Gallaudet. “It also automatically adds or removes hosts as needed to support the workload.” Because it doesn’t need to procure or manage its own fleet of over 500 GPUs, the company has saved close to 50 percent on its inferencing costs.
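As a rough illustration of the automatic scaling Gallaudet describes, the sketch below builds the two request payloads that Application Auto Scaling expects for a SageMaker endpoint variant: one that registers the variant's instance count as a scalable target, and one that attaches a target-tracking policy. The endpoint name, variant name, capacity bounds, and target value are hypothetical placeholders. The source does not describe Amazon Robotics' actual scaling configuration.

```python
def build_autoscaling_requests(endpoint_name, variant_name,
                               min_instances, max_instances,
                               invocations_per_instance):
    """Return (scalable_target, scaling_policy) request dicts.

    Every argument value passed in this example is an illustrative assumption.
    """
    common = {
        "ServiceNamespace": "sagemaker",
        "ResourceId": f"endpoint/{endpoint_name}/variant/{variant_name}",
        "ScalableDimension": "sagemaker:variant:DesiredInstanceCount",
    }
    # Bounds within which the service may add or remove hosts.
    scalable_target = {
        **common,
        "MinCapacity": min_instances,
        "MaxCapacity": max_instances,
    }
    # Target tracking: keep average invocations per instance near the target.
    scaling_policy = {
        **common,
        "PolicyName": f"{endpoint_name}-invocations-target-tracking",
        "PolicyType": "TargetTrackingScaling",
        "TargetTrackingScalingPolicyConfiguration": {
            "TargetValue": float(invocations_per_instance),
            "PredefinedMetricSpecification": {
                "PredefinedMetricType": "SageMakerVariantInvocationsPerInstance",
            },
        },
    }
    return scalable_target, scaling_policy


def apply_autoscaling(scalable_target, scaling_policy):
    """Send both requests to AWS (requires boto3 and valid credentials)."""
    import boto3  # imported lazily so the builder above runs without AWS access

    client = boto3.client("application-autoscaling")
    client.register_scalable_target(**scalable_target)
    client.put_scaling_policy(**scaling_policy)


# Hypothetical endpoint and limits, for illustration only.
target, policy = build_autoscaling_requests(
    "intent-detection-demo", "AllTraffic", 2, 20, 100)
print(target["ResourceId"])  # endpoint/intent-detection-demo/variant/AllTraffic
```

Keeping the payload construction separate from the AWS calls means the configuration can be inspected or unit-tested without credentials. `apply_autoscaling` is only needed when the policy is actually deployed.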
Reaping the Benefits of a Managed Solution and AWS Inferentia
Amazon Robotics has seen considerable success. The company has used Amazon SageMaker to reduce time spent on management and to balance the ratio of scientists to software development engineers. Amazon SageMaker also enabled the system to scale horizontally during its rollout across the Amazon fulfillment network—and the team is confident that Amazon SageMaker can handle its peak inference demands.
This solution is backed by Amazon Elastic Compute Cloud (Amazon EC2), which provides secure, resizable compute capacity in the cloud and lets users migrate quickly to newer host types as they become available. The Amazon Robotics team reduced its inference costs by 20 percent by migrating from Amazon EC2 P2 Instances to Amazon EC2 G4 Instances. Now using AWS Inferentia, the team is able to further reduce inference costs by 35 percent over G4 instances (over 50 percent reduction from P2 instances). Inferentia has also delivered 20 percent higher throughput, allowing the team to scan more packages a day without requiring more resources. “Our system will use more than 1000 SageMaker hosts in 2022 and AWS Inferentia helps us to serve the rapidly growing traffic at higher throughput without re-training our ML models,” says Pei Wang, a software engineer at Amazon Robotics.
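To see how the two migration savings combine, the short sketch below compounds successive percentage reductions. It assumes each percentage applies to the cost remaining after the previous step and that both are measured on the same workload. The source does not state either assumption. On those assumptions the rounded inputs give roughly 48 percent, so the "over 50 percent" figure quoted above presumably rests on unrounded numbers or a different baseline.

```python
def compounded_reduction(*step_reductions):
    """Total fractional cost reduction after applying each step in sequence."""
    remaining = 1.0
    for reduction in step_reductions:
        remaining *= 1.0 - reduction  # each step cuts what is left, not the original
    return 1.0 - remaining


p2_to_g4 = 0.20          # P2 -> G4 saving quoted in the text
g4_to_inferentia = 0.35  # G4 -> Inferentia saving quoted in the text

total = compounded_reduction(p2_to_g4, g4_to_inferentia)
print(f"{total:.0%}")  # 48%
```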
The Amazon SageMaker–powered solution grew rapidly after its initial deployment. The Amazon Robotics team started implementing the solution on a small scale at a fulfillment center in Wisconsin and rapidly expanded to dozens more. As the solution grew, Amazon SageMaker quickly and seamlessly scaled alongside it. “We expect to almost double our volume in 2022,” says Gallaudet.
Continuing a Steady March of Innovation
The team sees many other opportunities to experiment on AWS, including running its models on the edge using Amazon SageMaker Edge Manager, which efficiently manages and monitors ML models across fleets of smart devices. Amazon Robotics also expects to build models that can further automate package tracking and help automate package damage assessment.
By experimenting with cutting-edge technology, Amazon Robotics continues to increase efficiency in fulfillment centers and improve the Amazon customer experience. “Many of the techniques that we’ve learned and experiences we’ve had with the Intent Detection System have directly enabled us to move quickly on these projects,” says Stallman.
Benefits of AWS
- Saved nearly 50 percent on inferencing costs
- Improved computing performance rate by 20 percent
- Saved 20 percent on compute costs by rightsizing Amazon EC2 instances
AWS Services Used
Amazon EC2
Amazon EC2 is a web service that provides secure, resizable compute capacity in the cloud. It is designed to make web-scale cloud computing easier for developers.
Amazon EC2 G4 Instances
Amazon EC2 G4 instances are the industry’s most cost-effective and versatile GPU instances for deploying machine learning models such as image classification, object detection, and speech recognition, and for graphics-intensive applications such as remote graphics workstations, game streaming, and graphics rendering.
Amazon ECS
Amazon ECS is a fully managed container orchestration service. Customers such as Duolingo, Samsung, GE, and Cookpad use ECS to run their most sensitive and mission-critical applications because of its security, reliability, and scalability.
Amazon SageMaker
Amazon SageMaker helps data scientists and developers to prepare, build, train, and deploy high-quality machine learning (ML) models quickly by bringing together a broad set of capabilities purpose-built for ML.
More Amazon Stories
Amazon's Cloud Journey on AWS
Amazon.com migrated 5,000 databases from Oracle to AWS, cutting its annual database operating costs by more than half and reducing the latency of most critical services by 40 percent. The company leverages Amazon DynamoDB to scale Prime Video and Amazon SageMaker to identify misplaced inventory.
How Amazon is Achieving Database Freedom Using AWS
Migrating to AWS has cut Amazon's annual database operating costs by more than half. Database-administration and hardware-management overhead have been greatly reduced, and cost allocation across teams is much simpler than before.
Amazon Fulfillment Technologies Aurora Case Study
Amazon migrated its Inventory Management Services from Oracle Database to Amazon Aurora to improve availability and scalability and reduce its operational burden. Amazon is the world's leading online retailer and provides a wide range of cloud services through its AWS division.
Amazon Reduces Infrastructure Costs on Visual Bin Inspection by a Projected 40 percent Using Amazon SageMaker
Amazon Fulfillment Technologies migrated from a legacy custom solution for identifying misplaced inventory to Amazon SageMaker, reducing AWS infrastructure costs by a projected 40 percent per month and simplifying its architecture.
Amazon Prime Video Uses AWS to Deliver Solid Streaming Experience to More Than 18 Million Football Fans
Amazon Video selected AWS Elemental, an AWS company that combines deep video expertise with the power and scale of the cloud to empower media companies to deliver premium video experiences to consumers.
Prime Video Boosts Scale and Resilience Using Amazon DynamoDB
As part of its strategy to build a platform that could scale to meet projected needs for at least 10 years, Amazon decided to migrate Prime Video to AWS. The migration would replace CQS with a suite of 12 microservices, built using a range of AWS services that included Amazon DynamoDB, AWS Lambda, and Amazon Simple Queue Service (Amazon SQS).
Amazon.com Case Study
Amazon.com is the world's largest online retailer. In 2011, Amazon.com switched from tape backup to using cloud-based Amazon S3 for backing up the majority of its Oracle databases. By using AWS, Amazon.com was able to eliminate backup software and experienced a 12X performance improvement, reducing restore time from around 15 hours to 2.5 hours in select scenarios.
Amazon Advertising Case Study
Amazon Advertising Engineering and Development (AED) increased throughput and avoided a three-year project to rebuild performance monitors and management tools by shifting data held in Oracle databases to Amazon RDS. AED builds, manages, and scales the technologies that undergird Amazon's programmatic advertising offerings.
Amazon.com Buyer Fraud Service Gains Scalability, Cuts Costs in Half Using AWS
Amazon Transaction Risk Management Services (TRMS) migrated more than 100 on-premises Oracle databases to AWS. Because the Buyer Fraud Service is a critical application and must operate at 99.995 percent availability, TRMS decided to use PostgreSQL-compatible Amazon Aurora as the new platform to host its databases.
Amazon CloudWatch Delivers Metrics to Customers Faster, Saves Millions Annually Using Amazon DynamoDB TTL
Amazon CloudWatch (CloudWatch) is a service used by AWS customers to monitor their cloud resources and applications running on AWS.
Scaling Amazon.com’s Transactional Subledger using Amazon DynamoDB
Across its many business entities, Amazon uses the Financial Ledger and Accounting Systems Hub (FLASH) to process more than 20 billion financial transactions each month. In 2017, the FLASH organization decided to migrate all its Oracle databases to AWS services.
Amazon Speeds Desktop Deployment for 25,000 Global Workers Using Amazon WorkSpaces
Amazon's Client Engineering team uses Amazon WorkSpaces to give global workers fast, restricted access to corporate resources, onboard new users in minutes, and save over $17 million annually.
To learn more, visit aws.amazon.com/sagemaker.