This Guidance helps game developers automate the process of creating a non-player character (NPC) for their games and associated infrastructure. It uses Unreal Engine MetaHuman, along with foundation models (FMs), for instance the large language models (LLMs) Claude 2, and Llama 2, to improve NPC conversational skills. This leads to dynamic responses from the NPC that are unique to each player, adding to scripted dialogue. By using the Large Language Model Ops (LLMOps) methodology, this Guidance accelerates prototyping, and delivery time by continually integrating, and deploying the generative AI application, along with fine-tuning the LLMs. All while helping to ensure that the NPC has full access to a secure knowledge base of game lore.

This Guidance includes four parts: an overview architecture, an LLMOps pipeline architecture, a Foundation Model Operations (FMOps) architecture, and a database hydration architecture.

Please note: [Disclaimer]

Architecture Diagram

Download the architecture diagram PDF 
  • Overview
  • This architecture diagram shows an overview workflow for hosting a generative AI NPC on AWS.

  • LLMOps Pipeline
  • This architecture diagram shows the processes of deploying an LLMOps pipeline on AWS.

  • FMOps Pipeline
  • This architecture diagram shows the process of tuning a generative AI model using FMOps.

  • Database Hydration
  • This architecture diagram shows the process for database hydration by vectorizing and storing gamer lore for RAG.

Well-Architected Pillars

The AWS Well-Architected Framework helps you understand the pros and cons of the decisions you make when building systems in the cloud. The six pillars of the Framework allow you to learn architectural best practices for designing and operating reliable, secure, efficient, cost-effective, and sustainable systems. Using the AWS Well-Architected Tool, available at no charge in the AWS Management Console, you can review your workloads against these best practices by answering a set of questions for each pillar.

The architecture diagram above is an example of a Solution created with Well-Architected best practices in mind. To be fully Well-Architected, you should follow as many Well-Architected best practices as possible.

  • This Guidance uses AWS X-Ray, Lambda, API Gateway, and CloudWatch to track all API requests for generated NPC dialogue between the Unreal Engine Metahuman client and the Amazon Bedrock FM. This provides end-to-end visibility into the status of the Guidance, allowing you to granularly track each request and response from the game client so you can quickly identity issues and react accordingly. Additionally, this Guidance is codified as a CDK application using CodePipeline so operations teams and developers can address faults and bugs through appropriate change control methodologies and quickly deploy these updates or fixes using the CI/CD pipeline.

    Read the Operational Excellence whitepaper 
  • Amazon S3 provides encrypted protection for storing game lore documentation at rest in addition to encrypted access for data in transit, while ingesting game lore documentation into the vector or fine-tuning an Amazon Bedrock FM. API Gateway adds an additional layer of security between the Unreal Engine Metahuman and the Amazon Bedrock FM by providing TLS-based encryption of all data between the NPC and the model. Lastly, Amazon Bedrock implements automated abuse detection mechanisms to further identity and mitigate violations of the AWS Acceptable Use Policy and the AWS Responsible AI Policy.

    Read the Security whitepaper 
  • API Gateway manages the automated scaling and throttling of requests by the NPC to the FM. Additionally, since the entire infrastructure is codified using CI/CD pipelines, you can provision resources across multiple AWS accounts and multiple AWS Regions in parallel. This enables multiple simultaneous infrastructure re-deployment scenarios to help you overcome AWS Region-level failures. As serverless infrastructure resources, API Gateway and Lambda allow you to focus on game development instead of manually managing resource allocation and usage patterns for API requests.

    Read the Reliability whitepaper 
  • Serverless resources, such as Lambda and API Gateway, contribute to performance efficiency of the Guidance by providing both elasticity and scalability. This allows the Guidance to dynamically adapt to an increase or decrease in API calls from the NPC client. An elastic and scalable approach helps you right-size resources for optimal performance and to address unforeseen increases or decreases in API requests—without having to manually manage provisioned infrastructure resources.

    Read the Performance Efficiency whitepaper 
  • Codifying the Guidance as a CDK application provides game developers with the ability to quickly prototype and deploy their NPC characters into production. Developers get quick access to Amazon Bedrock FMs through an API Gateway REST API without having to engineer, build, and pre-train them. Turning around quick prototypes helps reduce the time and operations costs associated with building FMs from scratch.

    Read the Cost Optimization whitepaper 
  • Lambda provide a serverless, scalable, and event-driven approach without having to provision dedicated compute resources. Amazon S3 implements data lifecycle policies along with compression for all data across this Guidance, allowing for energy-efficient storage. Amazon Bedrock hosts FMs on AWS silicon, offering better performance per watt of standard compute resources.

    Read the Sustainability whitepaper 

Implementation Resources

The sample code is a starting point. It is industry validated, prescriptive but not definitive, and a peek under the hood to help you begin.

[Content Type]


This [blog post/e-book/Guidance/sample code] demonstrates how [insert short description].


The sample code; software libraries; command line tools; proofs of concept; templates; or other related technology (including any of the foregoing that are provided by our personnel) is provided to you as AWS Content under the AWS Customer Agreement, or the relevant written agreement between you and AWS (whichever applies). You should not use this AWS Content in your production accounts, or on production or other critical data. You are responsible for testing, securing, and optimizing the AWS Content, such as sample code, as appropriate for production grade use based on your specific quality control practices and standards. Deploying AWS Content may incur AWS charges for creating or using AWS chargeable resources, such as running Amazon EC2 instances or using Amazon S3 storage.

References to third-party services or organizations in this Guidance do not imply an endorsement, sponsorship, or affiliation between Amazon or AWS and the third party. Guidance from AWS is a technical starting point, and you can customize your integration with third-party services when you deploy the architecture.

Was this page helpful?