AWS Solutions Library

Guidance for Machine Learning for Near Real-Time Advertising on AWS

Overview

This Guidance helps AdTech users build, train, and deploy machine learning models to an ad auction server application. It helps reduce unanswered bid requests by decoupling model creation from model consumption: each runs in its own environment, so the two can scale and be secured independently.
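One way to picture the decoupling described above is a versioned hand-off: the training environment publishes a model, and the serving environment keeps answering from its local copy until it chooses to pick up a newer one. The sketch below is only an in-memory illustration under that assumption. `ModelRegistry`, `BidServer`, and the callable "model" are hypothetical names, not part of the sample code, and the registry stands in for whatever shared storage a real deployment would use.

```python
import threading
from dataclasses import dataclass
from typing import Callable, Dict, Optional

# Hypothetical: a "model" is any callable that scores a bid request.
Model = Callable[[Dict[str, float]], float]


@dataclass(frozen=True)
class ModelVersion:
    version: int
    predict: Model


class ModelRegistry:
    """Stand-in for shared storage that the training side writes to."""

    def __init__(self) -> None:
        self._lock = threading.Lock()
        self._latest: Optional[ModelVersion] = None

    def publish(self, predict: Model) -> ModelVersion:
        # Training environment: publish a new immutable version.
        with self._lock:
            number = 1 if self._latest is None else self._latest.version + 1
            self._latest = ModelVersion(number, predict)
            return self._latest

    def latest(self) -> Optional[ModelVersion]:
        with self._lock:
            return self._latest


class BidServer:
    """Serving environment: scores requests from its own local copy."""

    def __init__(self, registry: ModelRegistry) -> None:
        self._registry = registry
        self._active: Optional[ModelVersion] = None

    def refresh(self) -> bool:
        # Swap in a newer version if one exists; return True when swapped.
        latest = self._registry.latest()
        if latest is not None and (
            self._active is None or latest.version > self._active.version
        ):
            self._active = latest
            return True
        return False

    def score(self, request: Dict[str, float]) -> Optional[float]:
        # Returns None only when no model has ever been loaded.
        if self._active is None:
            return None
        return self._active.predict(request)
```

Because `score` never waits on the registry, publishing a new model does not block request handling; the server simply keeps using the previous version until the next `refresh`.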

How it works

The architecture diagram illustrates how to use this Guidance. It shows the key components and how they interact, step by step.

Well-Architected Pillars

The architecture diagram above is an example of a Solution created with Well-Architected best practices in mind. To be fully Well-Architected, you should follow as many Well-Architected best practices as possible.

Operational Excellence

This Guidance is implemented as two AWS Cloud Development Kit (AWS CDK) stacks. You can deploy changes to the application and infrastructure by following AWS CDK best practices. The AWS CDK CLI supports several authentication methods, including programmatic access keys, single sign-on, and federation.

An Amazon CloudWatch dashboard provides business metrics for monitoring. You can configure CloudWatch alarms to meet your operational needs.
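As an illustration of configuring an alarm, the sketch below builds the keyword arguments for the CloudWatch `PutMetricAlarm` operation and, optionally, submits them with boto3. The alarm name, namespace, metric name, and thresholds in the usage note are hypothetical; the metrics this Guidance actually publishes may differ.

```python
from typing import Any, Dict


def build_alarm_request(
    alarm_name: str,
    namespace: str,
    metric_name: str,
    threshold: float,
    period_seconds: int = 60,
    evaluation_periods: int = 3,
) -> Dict[str, Any]:
    """Build keyword arguments for CloudWatch's PutMetricAlarm operation."""
    if period_seconds <= 0 or evaluation_periods <= 0:
        raise ValueError("period_seconds and evaluation_periods must be positive")
    return {
        "AlarmName": alarm_name,
        "Namespace": namespace,
        "MetricName": metric_name,
        "Statistic": "Average",
        "Period": period_seconds,
        "EvaluationPeriods": evaluation_periods,
        "Threshold": threshold,
        # Alarm when the average stays above the threshold.
        "ComparisonOperator": "GreaterThanThreshold",
        # Assumption: missing data from the inference path is itself a problem.
        "TreatMissingData": "breaching",
    }


def create_alarm(request: Dict[str, Any]) -> None:
    # Needs boto3 and AWS credentials; imported lazily so the builder
    # above can be used and tested without either.
    import boto3

    boto3.client("cloudwatch").put_metric_alarm(**request)
```

For example, `build_alarm_request("inference-latency-high", "AdServer/Inference", "InferenceLatencyMs", 50.0)` would describe an alarm that fires when the (hypothetical) latency metric averages above 50 for three consecutive one-minute periods.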

Read the Operational Excellence whitepaper

Security

To improve privacy and security, this Guidance uses Amazon Virtual Private Cloud (Amazon VPC) service endpoints. Managed services such as AWS Lambda and Amazon ECS reduce security maintenance tasks. The Amazon S3 buckets used in this Guidance are encrypted and blocked from public access. Amazon Elastic Block Store (Amazon EBS) volumes are encrypted using a customer managed AWS Key Management Service (AWS KMS) key.
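The bucket settings described above can be sketched as two request payloads for the S3 `PutPublicAccessBlock` and `PutBucketEncryption` operations. This is a minimal sketch, not the Guidance's own CDK code. The text only says the buckets are encrypted, so using SSE-KMS for S3 is an assumption here, and the bucket name and key ID are placeholders you would supply.

```python
from typing import Any, Dict


def public_access_block() -> Dict[str, bool]:
    """All four S3 Block Public Access settings turned on."""
    return {
        "BlockPublicAcls": True,
        "IgnorePublicAcls": True,
        "BlockPublicPolicy": True,
        "RestrictPublicBuckets": True,
    }


def kms_encryption(kms_key_id: str) -> Dict[str, Any]:
    """Default bucket encryption with SSE-KMS (assumed, see lead-in)."""
    if not kms_key_id:
        raise ValueError("kms_key_id must not be empty")
    return {
        "Rules": [
            {
                "ApplyServerSideEncryptionByDefault": {
                    "SSEAlgorithm": "aws:kms",
                    "KMSMasterKeyID": kms_key_id,
                }
            }
        ]
    }


def harden_bucket(bucket: str, kms_key_id: str) -> None:
    # Needs boto3 and AWS credentials; imported lazily so the payload
    # builders above can be inspected without either.
    import boto3

    s3 = boto3.client("s3")
    s3.put_public_access_block(
        Bucket=bucket,
        PublicAccessBlockConfiguration=public_access_block(),
    )
    s3.put_bucket_encryption(
        Bucket=bucket,
        ServerSideEncryptionConfiguration=kms_encryption(kms_key_id),
    )
```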

Read the Security whitepaper

Reliability

This Guidance uses the managed services Amazon EMR, SageMaker, and Amazon ECS to minimize operational effort. Amazon EMR is designed to handle large-scale data processing. Model training runs on SageMaker, which provides configuration parameters to size the infrastructure according to demand. The data processing and model training parts run periodically in batches. The inference part must be near real-time, so it runs as containers in Amazon ECS, which allows multi-Availability Zone deployment and auto scaling for high availability.
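To make the auto scaling part concrete, the sketch below builds request payloads for Application Auto Scaling's `RegisterScalableTarget` and `PutScalingPolicy` operations for an ECS service, using target tracking on average CPU utilization. The cluster and service names, task limits, target value, and cooldowns are all assumptions for illustration; the Guidance may configure scaling differently.

```python
from typing import Any, Dict


def scalable_target(
    cluster: str, service: str, min_tasks: int, max_tasks: int
) -> Dict[str, Any]:
    """Payload for RegisterScalableTarget on an ECS service."""
    if not 0 < min_tasks <= max_tasks:
        raise ValueError("require 0 < min_tasks <= max_tasks")
    return {
        "ServiceNamespace": "ecs",
        "ResourceId": f"service/{cluster}/{service}",
        "ScalableDimension": "ecs:service:DesiredCount",
        "MinCapacity": min_tasks,
        "MaxCapacity": max_tasks,
    }


def cpu_target_tracking_policy(
    cluster: str, service: str, target_cpu_percent: float
) -> Dict[str, Any]:
    """Payload for PutScalingPolicy: keep average CPU near the target."""
    if not 0 < target_cpu_percent <= 100:
        raise ValueError("target_cpu_percent must be in (0, 100]")
    return {
        "PolicyName": f"{service}-cpu-target-tracking",
        "ServiceNamespace": "ecs",
        "ResourceId": f"service/{cluster}/{service}",
        "ScalableDimension": "ecs:service:DesiredCount",
        "PolicyType": "TargetTrackingScaling",
        "TargetTrackingScalingPolicyConfiguration": {
            "TargetValue": target_cpu_percent,
            "PredefinedMetricSpecification": {
                "PredefinedMetricType": "ECSServiceAverageCPUUtilization"
            },
            # Assumed cooldowns: scale out quickly, scale in cautiously.
            "ScaleOutCooldown": 60,
            "ScaleInCooldown": 300,
        },
    }


def apply_scaling(target: Dict[str, Any], policy: Dict[str, Any]) -> None:
    # Needs boto3 and AWS credentials; imported lazily.
    import boto3

    client = boto3.client("application-autoscaling")
    client.register_scalable_target(**target)
    client.put_scaling_policy(**policy)
```

Setting `min_tasks` to at least the number of Availability Zones in use is one simple way to keep a task running in each zone, although task placement itself is controlled by the ECS service configuration rather than by these payloads.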

Read the Reliability whitepaper

Performance Efficiency

For hands-on experience, you can deploy the provided code in your account in an AWS Region of your choice. First, use the sample data to run the data processing and machine learning steps, and then use the trained model to make inferences. To tailor this Guidance to your needs, you can incorporate your own data, modify the model training, and, if needed, adjust the inference part.

Read the Performance Efficiency whitepaper

Cost Optimization

This Guidance uses managed services that can run in any AWS Region without incurring additional licensing costs, so you can deploy to the Region that best fits your needs. The components are managed individually and incur cost only when used. Data transformation uses Amazon EMR clusters to provide the best cost-to-performance ratio when processing large data sets. From SageMaker Studio, you can create and terminate clusters to manage costs. The inference component uses container-based deployment to match resource consumption with actual demand.

Read the Cost Optimization whitepaper

Sustainability

This Guidance relies on shared storage and single sources of truth to avoid data duplication and reduce the total storage requirements of your workload. Fetch data from shared storage only when necessary, and detach unused volumes to free resources. Adjust provisioned compute resources to match actual demand. You can use CloudWatch metrics to understand utilization and right-size deployed resources.
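As a rough illustration of right-sizing from utilization data, the sketch below takes CPU utilization samples (for example, averages exported from CloudWatch) and suggests a task count that would bring the observed peak close to a target utilization. The heuristic, the function name, and the 60 percent default target are assumptions for illustration, not a rule taken from this Guidance.

```python
import math
from typing import Sequence


def recommend_task_count(
    cpu_samples: Sequence[float],
    current_tasks: int,
    target_cpu: float = 60.0,
) -> int:
    """Suggest a task count so the observed peak CPU lands near target_cpu.

    Assumes load spreads evenly across tasks, so total CPU demand is
    roughly current_tasks * peak utilization.
    """
    if not cpu_samples:
        raise ValueError("cpu_samples must not be empty")
    if current_tasks < 1:
        raise ValueError("current_tasks must be at least 1")
    if not 0 < target_cpu <= 100:
        raise ValueError("target_cpu must be in (0, 100]")
    peak = max(cpu_samples)
    # Never recommend fewer than one task.
    return max(1, math.ceil(current_tasks * peak / target_cpu))
```

With samples peaking at 30 percent on 8 tasks, the function suggests 4 tasks; with a 90 percent peak on 4 tasks, it suggests 6. Treat the result as a starting point to validate against latency requirements, not as a final capacity figure.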

Read the Sustainability whitepaper

Implementation resources

The sample code is a starting point. It is industry validated, prescriptive but not definitive, and a peek under the hood to help you begin.

Open sample code on GitHub

Disclaimer

The sample code; software libraries; command line tools; proofs of concept; templates; or other related technology (including any of the foregoing that are provided by our personnel) is provided to you as AWS Content under the AWS Customer Agreement, or the relevant written agreement between you and AWS (whichever applies). You should not use this AWS Content in your production accounts, or on production or other critical data. You are responsible for testing, securing, and optimizing the AWS Content, such as sample code, as appropriate for production grade use based on your specific quality control practices and standards. Deploying AWS Content may incur AWS charges for creating or using AWS chargeable resources, such as running Amazon EC2 instances or using Amazon S3 storage.
