AWS Greengrass ML Inference

Run machine learning models on AWS Greengrass devices

AWS Greengrass is software that lets you run local compute, messaging, data caching, and sync capabilities for connected devices in a secure way. With AWS Greengrass, connected devices can run AWS Lambda functions, keep device data in sync, and communicate with other devices securely – even when not connected to the Internet. Now, with the Greengrass Machine Learning (ML) Inference capability, you can also easily perform ML inference locally on connected devices.

Machine learning works by using powerful algorithms to discover patterns in data and construct complex mathematical models from those patterns. Once a model is built, you perform inference by applying new data to the trained model to make predictions for your application. Building and training ML models requires massive computing resources, so it is a natural fit for the cloud. But inference takes far less compute power and is typically done in real time as new data becomes available, so getting inference results with very low latency is important to ensuring your applications can respond quickly to local events.
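To make the training/inference split concrete, here is a minimal sketch in Python. The weights and bias stand in for a model artifact produced by cloud-side training (in practice a framework such as Apache MXNet serializes the whole model); inference is just applying those learned parameters to a new data point on the device.

```python
import math

# Hypothetical parameters produced by cloud-side training; a real model
# artifact would be serialized by an ML framework, not hard-coded.
WEIGHTS = [0.8, -0.4, 1.2]
BIAS = -0.5

def predict(features):
    """Inference: apply the trained model to new data to get a prediction."""
    score = sum(w * x for w, x in zip(WEIGHTS, features)) + BIAS
    return 1.0 / (1.0 + math.exp(-score))  # logistic activation

# A new sensor reading arriving at the device in real time
print(round(predict([1.0, 2.0, 0.5]), 3))  # → 0.525
```

Note the asymmetry: finding good values for the weights is the expensive, cloud-scale step; evaluating them on one new input is cheap enough to run on a constrained device.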

AWS Greengrass ML Inference gives you the best of both worlds. You use ML models that are built and trained in the cloud, and you deploy and run ML inference locally on connected devices. For example, you can build a predictive model in the cloud for the diamond heads of boring equipment and then run it underground, where there is no cloud connectivity, to predict wear on the diamond head.


Easily Run ML Inference on Connected Devices

Performing inference locally on connected devices reduces the latency and cost of sending device data to the cloud to make a prediction. Rather than streaming all data to the cloud for ML inference, inference is performed right on the device, and data is sent to the cloud only when it requires further processing. For example, you can use AWS Greengrass ML Inference to run voice-detection models on smart audio systems so they can respond to commands like raising or lowering the volume or turning power on and off, without sending huge volumes of voice data to the cloud. The audio system sends data to the cloud only when additional action is needed, such as ordering a new song.


You can pick an ML model for each of your connected devices, built and trained using the Amazon SageMaker service or stored in Amazon S3. AWS Greengrass ML Inference works with Apache MXNet, TensorFlow, Caffe2, and CNTK. For NVIDIA Jetson, Intel Apollo Lake devices, and Raspberry Pi, AWS Greengrass ML Inference includes a pre-built Apache MXNet package so you don't have to build or configure the ML framework for your device from scratch. A pre-built package for TensorFlow is coming soon.

Transfer Models to Your Connected Device with a Few Clicks

AWS Greengrass ML Inference makes it easy to transfer your machine learning model from the cloud to your devices with just a few clicks in the Greengrass console. From the console, you select the desired machine learning model along with your AWS Lambda code, and they are then deployed and run on your connected devices.
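On the device, the deployed Lambda function loads the model artifact from a local path and scores incoming messages. The sketch below shows that shape; the path, the `load_model` placeholder, and the averaging "model" are all illustrative stand-ins (a real function would restore a framework-specific model, e.g. an Apache MXNet module, from whatever destination path you configure for the ML resource).

```python
import json

# Hypothetical local path where Greengrass places the deployed model
# artifact; the actual path is whatever you configure in the console.
MODEL_PATH = "/greengrass-machine-learning/mxnet/model"

def load_model(path):
    # Placeholder for framework-specific loading of the cloud-trained
    # model. Here: a trivial "model" that averages the input features.
    return lambda features: sum(features) / len(features)

model = load_model(MODEL_PATH)  # load once, at Lambda startup

def function_handler(event, context):
    """Score each incoming message locally, without a cloud round trip."""
    features = json.loads(event)["features"]
    return {"prediction": model(features)}
```

Loading the model outside the handler matters on constrained hardware: the expensive initialization happens once, and each message pays only the cost of a forward pass.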

Accelerate Inference Performance with GPUs

AWS Greengrass ML Inference gives you access to hardware accelerators, such as GPUs on your devices, by including the accelerator device as a Greengrass local resource in the Greengrass console.
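As a sketch of what that local-resource attachment looks like, a Greengrass group resource definition for a GPU might resemble the fragment below. The resource name, ID, and device path (`/dev/nvidia0`) are illustrative assumptions; the structure follows the Greengrass local device resource format.

```json
{
  "Name": "gpu-device",
  "Id": "gpu-resource-id",
  "ResourceDataContainer": {
    "LocalDeviceResourceData": {
      "SourcePath": "/dev/nvidia0",
      "GroupOwnerSetting": {
        "AutoAddGroupOwner": true
      }
    }
  }
}
```

Attaching the device resource to the Lambda function gives the local inference code access to the accelerator without granting it blanket access to the host.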

How It Works

[Diagram: AWS Greengrass ML Inference - How It Works]

Use Cases

Video Processing

AWS Greengrass ML Inference can be deployed on connected devices like security cameras, traffic cameras, body cameras, and medical imaging equipment to help them make predictions locally. With AWS Greengrass ML Inference, you can deploy and run ML models like facial recognition, object detection, and image density directly on the device. For example, a traffic camera could count bicycles, vehicles, and pedestrians passing through an intersection and detect when traffic signals need to be adjusted in order to optimize traffic flows and keep people safe.

Retail and Hospitality

Retailers, cruise lines, and amusement parks are investing in IoT applications to provide better customer service. For example, you can run facial recognition models on in-store cameras to locate VIP customers and give them white glove treatment, such as moving them to the front of checkout lines, greeting them by name at the door, and offering them special discounts. Cameras locate the customers and alert customer service staff without having to send massive amounts of video data to the cloud, which is often a problem in large stores with poor cloud connectivity.


Security

Connected devices are big targets for hackers. To combat this threat, you can create a behavioral model that predicts how a device should act. You can then use AWS Greengrass ML Inference to deploy and run the behavioral model directly on devices to detect deviations from normal behavior that could indicate an attack. When abnormal behavior is detected, the device can send the relevant data to AWS for further processing and action, such as pushing a security fix.
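A minimal sketch of such a deviation check, assuming the cloud-trained behavioral model boils down to a baseline mean and standard deviation for some telemetry signal (the numbers below are invented for illustration):

```python
# Illustrative baseline only; a real behavioral model would be trained
# in the cloud on historical device telemetry and deployed to the device.
BASELINE_MEAN = 40.0  # e.g. expected messages per minute (assumed)
BASELINE_STD = 5.0

def is_anomalous(observed, threshold=3.0):
    """Flag behavior more than `threshold` standard deviations from normal."""
    z = abs(observed - BASELINE_MEAN) / BASELINE_STD
    return z > threshold

print(is_anomalous(42.0))  # False: normal traffic
print(is_anomalous(90.0))  # True: deviation worth reporting to the cloud
```

Only the flagged readings need to leave the device, which keeps cloud traffic small while still surfacing likely attacks for follow-up.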


Learn more about AWS Greengrass ML Inference by joining the preview

Sign Up for the Preview