This Guidance shows how to fine-tune and deploy a Stable Diffusion model to generate personalized avatars from a simple text prompt. Stable Diffusion is a text-to-image generative artificial intelligence (AI) model that leverages the latest advances in machine learning. Here, the models are built on Amazon SageMaker and fine-tuned with the DreamBooth approach, which uses 10-15 images of the user to capture the precise details of the subject. The fine-tuned model generates a personalized avatar that can be used in a variety of applications, including social media, gaming, and virtual events. The Guidance also includes a text prompt feature that allows users to generate avatars based on specific text inputs. This feature expands the capabilities of applications and gives media and entertainment organizations more ways to develop personalized content tailored to the consumer.
This Guidance provides an AI-based approach for helping media and entertainment organizations develop personalized, tailored content at scale. However, users of this Guidance should take precautions to ensure these AI capabilities are not abused or manipulated. Visit Safe image generation and diffusion models with Amazon AI content moderation services to learn about safeguarding content through a proper moderation mechanism.
Please note: [Disclaimer]
Architecture Diagram
[Architecture diagram]
Step 1
Initiate training through a call to an Amazon API Gateway RESTful API endpoint using AWS Identity and Access Management (IAM) authentication.
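As a rough sketch of what this call can look like from a client, the Python snippet below signs a request with SigV4 credentials before posting it to the training endpoint. The URL, path, and payload fields are hypothetical placeholders and depend on how you deploy the API; the `requests` library is assumed to be available.

```python
# Minimal sketch: calling an IAM-authenticated API Gateway endpoint to start training.
# The URL and payload fields below are hypothetical placeholders.
import json
import boto3
import requests
from botocore.auth import SigV4Auth
from botocore.awsrequest import AWSRequest

REGION = "us-east-1"
TRAIN_URL = "https://<api-id>.execute-api.us-east-1.amazonaws.com/prod/train"  # placeholder

payload = {
    "user_id": "user-123",                                     # hypothetical field
    "images_s3_prefix": "s3://my-bucket/uploads/user-123/",    # hypothetical field
}

# Sign the request with the caller's IAM credentials (SigV4) for the execute-api service.
credentials = boto3.Session().get_credentials().get_frozen_credentials()
request = AWSRequest(method="POST", url=TRAIN_URL, data=json.dumps(payload),
                     headers={"Content-Type": "application/json"})
SigV4Auth(credentials, "execute-api", REGION).add_auth(request)

# Send the signed request to API Gateway.
response = requests.post(TRAIN_URL, data=request.body, headers=dict(request.headers))
print(response.status_code, response.text)
```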
Step 2
An AWS Lambda function packages user images and training configuration files, and uploads them to an Amazon Simple Storage Service (Amazon S3) bucket. It then invokes the training job.
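A minimal Lambda handler sketch is shown below, assuming boto3 and hypothetical bucket, endpoint, and field names; it is not the Guidance's actual code, and the packaging of the raw images themselves is omitted.

```python
# Sketch of the training Lambda function: store the job configuration in S3 and
# invoke the SageMaker asynchronous endpoint that runs the fine-tuning job.
# Bucket, endpoint, and field names are hypothetical.
import json
import boto3

s3 = boto3.client("s3")
sm_runtime = boto3.client("sagemaker-runtime")

TRAIN_BUCKET = "avatar-training-bucket"          # hypothetical bucket
TRAIN_ENDPOINT = "dreambooth-training-endpoint"  # hypothetical async endpoint name

def handler(event, context):
    body = json.loads(event["body"])             # assumes an API Gateway proxy payload
    user_id = body["user_id"]

    # Package the training request: pointers to the user's images plus configuration.
    job_config = {
        "images_s3_prefix": body["images_s3_prefix"],
        "instance_prompt": body.get("prompt", "a photo of sks person"),
        "max_train_steps": 800,
    }
    config_key = f"training-jobs/{user_id}/config.json"
    s3.put_object(Bucket=TRAIN_BUCKET, Key=config_key, Body=json.dumps(job_config))

    # Kick off the asynchronous invocation; SageMaker queues the request (Step 3).
    response = sm_runtime.invoke_endpoint_async(
        EndpointName=TRAIN_ENDPOINT,
        InputLocation=f"s3://{TRAIN_BUCKET}/{config_key}",
        ContentType="application/json",
    )
    return {"statusCode": 202, "body": json.dumps({"job_id": response["InferenceId"]})}
```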
Step 3
An Amazon SageMaker Asynchronous Inference endpoint manages the fine-tuning process. Training jobs are automatically queued before going through image preparation, model fine-tuning, and post-processing steps.
Step 4
SageMaker publishes job status through Amazon Simple Notification Service (Amazon SNS) topics.
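The snippet below sketches how Steps 3 and 4 might be wired together when the asynchronous endpoint configuration is created, with job status published to SNS topics on success or error; all names and ARNs are hypothetical.

```python
# Sketch of the asynchronous endpoint configuration (Steps 3-4): requests are queued,
# and completion or failure notifications are published to SNS topics.
import boto3

sm = boto3.client("sagemaker")

sm.create_endpoint_config(
    EndpointConfigName="dreambooth-training-async-config",
    ProductionVariants=[{
        "VariantName": "AllTraffic",
        "ModelName": "dreambooth-training-model",   # hypothetical model name
        "InstanceType": "ml.g5.2xlarge",
        "InitialInstanceCount": 1,
    }],
    AsyncInferenceConfig={
        "OutputConfig": {
            "S3OutputPath": "s3://avatar-training-bucket/async-output/",
            "NotificationConfig": {
                "SuccessTopic": "arn:aws:sns:us-east-1:111122223333:train-success",
                "ErrorTopic": "arn:aws:sns:us-east-1:111122223333:train-error",
            },
        },
        "ClientConfig": {"MaxConcurrentInvocationsPerInstance": 1},
    },
)
```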
Step 5
The user application subscribes to an Amazon Simple Queue Service (Amazon SQS) queue to receive an update when a training job is completed.
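One possible way for the application to receive these updates is sketched below: subscribe an SQS queue to the SNS status topic, then poll the queue. The queue URL, queue ARN, and topic ARN are placeholders, and the queue's access policy must also allow the topic to send messages.

```python
# Sketch of Step 5: route SNS training-status notifications into an SQS queue and
# poll the queue from the user application. Identifiers are hypothetical.
import json
import boto3

sns = boto3.client("sns")
sqs = boto3.client("sqs")

QUEUE_ARN = "arn:aws:sqs:us-east-1:111122223333:train-status-queue"
QUEUE_URL = "https://sqs.us-east-1.amazonaws.com/111122223333/train-status-queue"
TOPIC_ARN = "arn:aws:sns:us-east-1:111122223333:train-success"

# One-time setup: subscribe the queue to the status topic.
sns.subscribe(TopicArn=TOPIC_ARN, Protocol="sqs", Endpoint=QUEUE_ARN)

# Application side: long-poll for training-completion messages.
messages = sqs.receive_message(QueueUrl=QUEUE_URL, MaxNumberOfMessages=10, WaitTimeSeconds=20)
for msg in messages.get("Messages", []):
    notification = json.loads(msg["Body"])
    print("Training job update:", notification.get("Message"))
    sqs.delete_message(QueueUrl=QUEUE_URL, ReceiptHandle=msg["ReceiptHandle"])
```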
Step 6
Model artifacts are uploaded to the Amazon S3 model hosting bucket.
Step 7
Initiate inference through a call to an API Gateway RESTful API endpoint using IAM authentication.
Step 8
A Lambda function invokes the model endpoint.
Step 9
SageMaker multi-model endpoints (MME) provide inference from personalized models that are dynamically loaded and cached from the Amazon S3 model hosting bucket, based on the traffic pattern to each model.
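The snippet below sketches the inference path of Steps 7-9: the Lambda function behind API Gateway calls the multi-model endpoint and passes TargetModel so SageMaker loads the caller's personalized model from the model hosting bucket. The endpoint name, model key, and prompt are hypothetical.

```python
# Sketch of Steps 7-9: invoke the multi-model endpoint with the caller's personalized
# model artifact, identified relative to the MME's S3 prefix.
import json
import boto3

sm_runtime = boto3.client("sagemaker-runtime")

response = sm_runtime.invoke_endpoint(
    EndpointName="avatar-mme-endpoint",                  # hypothetical MME name
    TargetModel="user-123/model.tar.gz",                 # artifact key under the MME S3 prefix
    ContentType="application/json",
    Body=json.dumps({"prompt": "portrait of sks person as an astronaut"}),
)
print(json.loads(response["Body"].read()))
```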
Well-Architected Pillars
The AWS Well-Architected Framework helps you understand the pros and cons of the decisions you make when building systems in the cloud. The six pillars of the Framework allow you to learn architectural best practices for designing and operating reliable, secure, efficient, cost-effective, and sustainable systems. Using the AWS Well-Architected Tool, available at no charge in the AWS Management Console, you can review your workloads against these best practices by answering a set of questions for each pillar.
The architecture diagram above is an example of a Solution created with Well-Architected best practices in mind. To be fully Well-Architected, you should follow as many Well-Architected best practices as possible.
Operational Excellence
SageMaker multi-model endpoints and Amazon CloudWatch are used throughout this Guidance to enhance your operational excellence. First, SageMaker multi-model endpoints allow you to deploy a multitude of models behind a single endpoint, reducing the number of endpoints you need to manage. SageMaker manages loading and caching models based on your traffic patterns, and you can add or update a model without redeploying the endpoint: simply upload it to the SageMaker-managed Amazon S3 location. Additionally, SageMaker automatically integrates with CloudWatch, where you can track metrics, events, and log files from the model and gain insights into its performance. You can also set up alarms to proactively monitor issues before they impact the customer experience.
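As an illustration (with hypothetical bucket, endpoint, and threshold values), adding a newly fine-tuned model is just an S3 upload to the SageMaker-managed location, and a CloudWatch alarm can watch endpoint latency:

```python
# Sketch: publish a new personalized model without redeploying the endpoint, and
# alarm on elevated model latency for the shared endpoint. Names are hypothetical.
import boto3

s3 = boto3.client("s3")
cloudwatch = boto3.client("cloudwatch")

# Upload the new model artifact under the multi-model endpoint's S3 prefix.
s3.upload_file("model.tar.gz", "avatar-model-hosting-bucket", "models/user-123/model.tar.gz")

# Alarm when average model latency stays high for five consecutive minutes.
cloudwatch.put_metric_alarm(
    AlarmName="avatar-mme-high-latency",
    Namespace="AWS/SageMaker",
    MetricName="ModelLatency",
    Dimensions=[
        {"Name": "EndpointName", "Value": "avatar-mme-endpoint"},
        {"Name": "VariantName", "Value": "AllTraffic"},
    ],
    Statistic="Average",
    Period=60,
    EvaluationPeriods=5,
    Threshold=5_000_000,   # ModelLatency is reported in microseconds
    ComparisonOperator="GreaterThanThreshold",
)
```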
Security
API Gateway provides built-in mechanisms to authenticate and authorize API requests, helping prevent denial-of-service attacks and other types of abuse that can overload your backend resources. You can use IAM roles, Amazon Cognito user pools, or OAuth 2.0 to control access to your APIs. To protect data in transit, API Gateway ensures traffic to your endpoint is encrypted with SSL/TLS, and it supports API throttling to shield your APIs from excessive traffic or abuse. Also consider adding AWS WAF, a web application firewall, in front of API Gateway to protect applications from web-based attacks and exploits. Finally, consider AWS Shield to protect your workloads from distributed denial of service (DDoS) attacks.
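If you add AWS WAF, the association with the API Gateway stage can be made with a single call, sketched below with hypothetical ARNs:

```python
# Sketch: attach an AWS WAF web ACL to the API Gateway stage that fronts this Guidance's APIs.
import boto3

wafv2 = boto3.client("wafv2")

wafv2.associate_web_acl(
    WebACLArn="arn:aws:wafv2:us-east-1:111122223333:regional/webacl/avatar-api-acl/EXAMPLE-ID",
    ResourceArn="arn:aws:apigateway:us-east-1::/restapis/abc123/stages/prod",
)
```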
Reliability
API Gateway, Lambda, and SageMaker are deployed throughout this Guidance to enhance the reliability of your workloads. First, API Gateway provides built-in fault tolerance and automatic scaling to handle spikes in traffic, and it integrates with Lambda and SageMaker to make it easy to build scalable, serverless APIs. Moreover, SageMaker is designed to provide high reliability and availability for running machine learning workloads and serving machine learning models. It provides managed automatic scaling, fault tolerance, health checks, monitoring, and diagnostics, and it runs on a distributed infrastructure spread across multiple Availability Zones. Together, these capabilities help ensure reliability for your model training and inference.
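The sketch below shows one way to register the inference endpoint's production variant with Application Auto Scaling and attach a target-tracking policy; the endpoint name, capacity limits, and target value are hypothetical and should be tuned for your workload.

```python
# Sketch: scale the real-time endpoint's instance count to track invocations per instance.
import boto3

autoscaling = boto3.client("application-autoscaling")
resource_id = "endpoint/avatar-mme-endpoint/variant/AllTraffic"   # hypothetical endpoint

autoscaling.register_scalable_target(
    ServiceNamespace="sagemaker",
    ResourceId=resource_id,
    ScalableDimension="sagemaker:variant:DesiredInstanceCount",
    MinCapacity=1,
    MaxCapacity=4,
)

autoscaling.put_scaling_policy(
    PolicyName="avatar-mme-target-tracking",
    ServiceNamespace="sagemaker",
    ResourceId=resource_id,
    ScalableDimension="sagemaker:variant:DesiredInstanceCount",
    PolicyType="TargetTrackingScaling",
    TargetTrackingScalingPolicyConfiguration={
        "TargetValue": 50.0,   # target invocations per instance per minute
        "PredefinedMetricSpecification": {
            "PredefinedMetricType": "SageMakerVariantInvocationsPerInstance"
        },
    },
)
```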
Performance Efficiency
SageMaker is used here to enhance performance efficiency, providing a high-performance, low-latency inference service for hosting machine learning models. You can configure the instance type, instance count, and other deployment settings to right-size your inference workload, optimizing for latency, throughput, and cost.
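A minimal sketch of right-sizing the shared inference endpoint is shown below; the container image, model location, instance type, and count are hypothetical and should be chosen from your own load testing.

```python
# Sketch: define a multi-model serving container and choose instance type and count
# in the endpoint configuration to meet latency and throughput targets.
import boto3

sm = boto3.client("sagemaker")

sm.create_model(
    ModelName="avatar-mme",
    ExecutionRoleArn="arn:aws:iam::111122223333:role/SageMakerExecutionRole",
    PrimaryContainer={
        "Image": "111122223333.dkr.ecr.us-east-1.amazonaws.com/stable-diffusion-inference:latest",
        "Mode": "MultiModel",   # serve many personalized models from one container
        "ModelDataUrl": "s3://avatar-model-hosting-bucket/models/",
    },
)

sm.create_endpoint_config(
    EndpointConfigName="avatar-mme-config",
    ProductionVariants=[{
        "VariantName": "AllTraffic",
        "ModelName": "avatar-mme",
        "InstanceType": "ml.g5.2xlarge",   # pick a GPU size that meets latency targets
        "InitialInstanceCount": 2,         # scale the count for throughput
    }],
)
```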
Cost Optimization
SageMaker multi-model endpoints provide a scalable and cost-effective way to deploy large numbers of models. They use a shared serving container to host all of your models, reducing the overhead of managing separate endpoints. When some models receive little traffic, this resource sharing maximizes infrastructure utilization and saves costs compared with hosting each model on its own endpoint.
Sustainability
SageMaker Asynchronous Inference queues incoming requests and processes them asynchronously. This means the endpoint can automatically scale down to zero instances when it is not in use, saving compute resources while idle and helping to minimize the environmental impact of running your cloud workloads.
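The sketch below shows one way to allow scale-to-zero for the asynchronous endpoint by tracking the request backlog; the endpoint name and target values are hypothetical.

```python
# Sketch: allow the asynchronous fine-tuning endpoint to scale down to zero instances
# when its request backlog is empty.
import boto3

autoscaling = boto3.client("application-autoscaling")
resource_id = "endpoint/dreambooth-training-endpoint/variant/AllTraffic"   # hypothetical

autoscaling.register_scalable_target(
    ServiceNamespace="sagemaker",
    ResourceId=resource_id,
    ScalableDimension="sagemaker:variant:DesiredInstanceCount",
    MinCapacity=0,   # permit scale-to-zero while idle
    MaxCapacity=2,
)

autoscaling.put_scaling_policy(
    PolicyName="async-backlog-scaling",
    ServiceNamespace="sagemaker",
    ResourceId=resource_id,
    ScalableDimension="sagemaker:variant:DesiredInstanceCount",
    PolicyType="TargetTrackingScaling",
    TargetTrackingScalingPolicyConfiguration={
        "TargetValue": 1.0,   # aim for about one queued request per instance
        "CustomizedMetricSpecification": {
            "MetricName": "ApproximateBacklogSizePerInstance",
            "Namespace": "AWS/SageMaker",
            "Dimensions": [{"Name": "EndpointName", "Value": "dreambooth-training-endpoint"}],
            "Statistic": "Average",
        },
    },
)
```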
Implementation Resources
The sample code is a starting point. It is industry validated, prescriptive but not definitive, and a peek under the hood to help you begin.
Related Content
Safe image generation and diffusion models with Amazon AI content moderation services
Disclaimer
The sample code; software libraries; command line tools; proofs of concept; templates; or other related technology (including any of the foregoing that are provided by our personnel) is provided to you as AWS Content under the AWS Customer Agreement, or the relevant written agreement between you and AWS (whichever applies). You should not use this AWS Content in your production accounts, or on production or other critical data. You are responsible for testing, securing, and optimizing the AWS Content, such as sample code, as appropriate for production grade use based on your specific quality control practices and standards. Deploying AWS Content may incur AWS charges for creating or using AWS chargeable resources, such as running Amazon EC2 instances or using Amazon S3 storage.
References to third-party services or organizations in this Guidance do not imply an endorsement, sponsorship, or affiliation between Amazon or AWS and the third party. Guidance from AWS is a technical starting point, and you can customize your integration with third-party services when you deploy the architecture.