This Guidance demonstrates how software developers can use an Amazon SageMaker notebook instance to train and evaluate AWS DeepRacer models directly, with full control over the process. That control includes augmenting the simulation environment, manipulating inputs to the neural network, modifying the neural network architecture, running distributed rollouts, and debugging the model. By contrast, the AWS DeepRacer console is optimized to provide a user-friendly introduction to reinforcement learning for developers new to machine learning.

Please note: [Disclaimer]

Architecture Diagram

[Architecture diagram description]


Well-Architected Pillars

The AWS Well-Architected Framework helps you understand the pros and cons of the decisions you make when building systems in the cloud. The six pillars of the Framework allow you to learn architectural best practices for designing and operating reliable, secure, efficient, cost-effective, and sustainable systems. Using the AWS Well-Architected Tool, available at no charge in the AWS Management Console, you can review your workloads against these best practices by answering a set of questions for each pillar.

The architecture diagram above is an example of a Solution created with Well-Architected best practices in mind. To be fully Well-Architected, you should follow as many Well-Architected best practices as possible.

  • The services you can configure in this Guidance enhance your operational excellence in several ways. First, SageMaker streamlines the process of training reinforcement learning (RL) models, while RoboMaker automates the creation of simulation environments and data generation. Second, Kinesis Video Streams allows real-time monitoring of model training and evaluation. Third, CloudWatch provides centralized logging and monitoring for all services involved, enabling efficient operations management.

    Read the Operational Excellence whitepaper 
  • CloudWatch offers centralized security monitoring and alerts to support efficient threat detection. Also, Amazon Virtual Private Cloud (Amazon VPC) with VPC endpoints ensures private and secure communication between SageMaker, RoboMaker, and Amazon S3, preventing exposure to the public internet.

    Read the Security whitepaper 
  • SageMaker supports the reliability of model training by managing the execution of training jobs, including fault tolerance and recovery. Amazon S3 offers reliable storage for training data and model images, ensuring data availability and durability. RoboMaker contributes to reliability by creating and managing the simulation environment, enabling robust data generation for training. Kinesis Video Streams streams live training and evaluation video, allowing real-time monitoring for reliability assessment. It also provides capabilities in multiple Availability Zones. Finally, CloudWatch provides comprehensive logs, metrics, and operational insights that help you identify and mitigate reliability issues promptly.

    Read the Reliability whitepaper 
  • The management capabilities of SageMaker streamline model training by using compute resources efficiently and right-sizing the instance on which training runs. In this Guidance, the SageMaker notebook uses an ml.t3.2xlarge instance and SageMaker training uses ml.c4.2xlarge instances, which optimizes the performance of SageMaker for this workload. Additionally, RoboMaker enhances performance efficiency by creating and managing a simulation environment optimized for AWS DeepRacer training.

    Read the Performance Efficiency whitepaper 
  • SageMaker training jobs are sized to the workload and shut down when the training job is complete, helping you avoid unnecessary costs. The cleanup code for the SageMaker notebook further supports efficient resource use by removing unnecessary components. RoboMaker also shuts down automatically, which reduces idle resource costs, and includes cleanup code that deletes resources to minimize residual costs.

    Read the Cost Optimization whitepaper 
  • SageMaker supports sustainability by helping you efficiently manage resources during RL model training, reducing energy consumption and your environmental impact. Right-sizing the underlying instance helps optimize compute resources sustainably. Furthermore, Kinesis Video Streams enables real-time monitoring, helping you make informed decisions to optimize resource usage and minimize energy waste.

    Read the Sustainability whitepaper 
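
The sizing, shutdown, and private-networking points in the pillars above can be pictured as a small set of training job settings. The following is a minimal sketch, not the Guidance's actual sample code: it builds a plain dictionary that uses field names from the SageMaker CreateTrainingJob request (ResourceConfig, StoppingCondition, VpcConfig) and makes no AWS calls. Only the ml.c4.2xlarge instance type comes from the text above. The class name, the function name, and the instance count, volume size, runtime limit, subnet IDs, and security group IDs are hypothetical values chosen for illustration.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List


@dataclass
class TrainingJobSettings:
    """Hypothetical container for the settings discussed in the pillars."""

    # Training instance type named in the Performance Efficiency pillar.
    instance_type: str = "ml.c4.2xlarge"
    # The values below are assumptions for illustration only.
    instance_count: int = 1
    volume_size_gb: int = 32
    max_runtime_seconds: int = 3600
    subnet_ids: List[str] = field(default_factory=list)
    security_group_ids: List[str] = field(default_factory=list)


def build_training_job_fragment(settings: TrainingJobSettings) -> Dict[str, Any]:
    """Return a partial CreateTrainingJob-style request as a plain dict."""
    if settings.instance_count < 1:
        raise ValueError("instance_count must be at least 1")
    if settings.max_runtime_seconds <= 0:
        raise ValueError("max_runtime_seconds must be positive")

    fragment: Dict[str, Any] = {
        # Right-sizing: the instance type and count bound the compute used.
        "ResourceConfig": {
            "InstanceType": settings.instance_type,
            "InstanceCount": settings.instance_count,
            "VolumeSizeInGB": settings.volume_size_gb,
        },
        # Cost control: the job is stopped once this runtime is reached.
        "StoppingCondition": {
            "MaxRuntimeInSeconds": settings.max_runtime_seconds,
        },
    }

    # Private networking: add VpcConfig only when both lists are provided.
    if settings.subnet_ids and settings.security_group_ids:
        fragment["VpcConfig"] = {
            "Subnets": list(settings.subnet_ids),
            "SecurityGroupIds": list(settings.security_group_ids),
        }
    return fragment


if __name__ == "__main__":
    example = TrainingJobSettings(
        subnet_ids=["subnet-EXAMPLE"],        # placeholder, not a real ID
        security_group_ids=["sg-EXAMPLE"],    # placeholder, not a real ID
    )
    print(build_training_job_fragment(example))
```

A fragment like this would still need the remaining required request fields (such as the job name, role, and algorithm specification) before it could be submitted. It is shown only to make the right-sizing and bounded-runtime ideas concrete.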

Implementation Resources

The sample code is a starting point. It is industry validated, prescriptive but not definitive, and a peek under the hood to help you begin.

Disclaimer

The sample code; software libraries; command line tools; proofs of concept; templates; or other related technology (including any of the foregoing that are provided by our personnel) is provided to you as AWS Content under the AWS Customer Agreement, or the relevant written agreement between you and AWS (whichever applies). You should not use this AWS Content in your production accounts, or on production or other critical data. You are responsible for testing, securing, and optimizing the AWS Content, such as sample code, as appropriate for production grade use based on your specific quality control practices and standards. Deploying AWS Content may incur AWS charges for creating or using AWS chargeable resources, such as running Amazon EC2 instances or using Amazon S3 storage.

References to third-party services or organizations in this Guidance do not imply an endorsement, sponsorship, or affiliation between Amazon or AWS and the third party. Guidance from AWS is a technical starting point, and you can customize your integration with third-party services when you deploy the architecture.