This Guidance provides a step-by-step guide for creating a retrieval-augmented generation (RAG) application, such as a question-answering bot. By combining AWS services, open-source foundation models, and packages such as LangChain and Streamlit, you can create an enterprise-ready application. The RAG-based approach uses a similarity search to retrieve relevant context for users' inquiries, which improves the accuracy and completeness of the responses.
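As a rough illustration of the similarity search step, the sketch below ranks a few documents against a query by cosine similarity. It is a minimal sketch, not the Guidance's implementation: the three-dimensional vectors, the document texts, and the function names `cosine_similarity` and `top_k` are all hypothetical. In a deployed application, the vectors would come from an embedding model and the search would run in a vector store such as OpenSearch Service.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # Treat a zero vector as dissimilar to everything.
    return dot / (norm_a * norm_b)

def top_k(query_vec, docs, k=2):
    """Return the texts of the k documents most similar to the query."""
    scored = [(cosine_similarity(query_vec, vec), text) for text, vec in docs]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [text for _, text in scored[:k]]

# Toy (text, embedding) pairs; the numbers are invented for illustration.
DOCS = [
    ("Document about billing", [0.9, 0.1, 0.0]),
    ("Document about vector search", [0.1, 0.9, 0.1]),
    ("Document about the chat UI", [0.0, 0.2, 0.9]),
]

query_vec = [0.2, 0.8, 0.1]  # Hypothetical embedding of the user's question.
context = top_k(query_vec, DOCS, k=2)
print(context)
```

The retrieved texts are what the application would pass to the language model as context alongside the user's question.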

Please note: [Disclaimer]

Architecture Diagram

[Architecture diagram description]


Well-Architected Pillars

The AWS Well-Architected Framework helps you understand the pros and cons of the decisions you make when building systems in the cloud. The six pillars of the Framework allow you to learn architectural best practices for designing and operating reliable, secure, efficient, cost-effective, and sustainable systems. Using the AWS Well-Architected Tool, available at no charge in the AWS Management Console, you can review your workloads against these best practices by answering a set of questions for each pillar.

The architecture diagram above is an example of a Solution created with Well-Architected best practices in mind. To be fully Well-Architected, you should follow as many Well-Architected best practices as possible.

  • This Guidance enhances operational excellence by automating tasks and providing capabilities that reduce manual efforts, enhance system reliability, and bolster security. AWS services like SageMaker, API Gateway, Lambda, and OpenSearch Service are fully managed, removing the need for your development team to handle server provisioning, patching, and routine maintenance. Additionally, they automate aspects like model deployment, code implementation, scaling, and failover, reducing the likelihood of human errors and accelerating response times during operational events.

    Read the Operational Excellence whitepaper 
  • This Guidance prioritizes security, protecting user data and interactions and building trust among users. Services like SageMaker, API Gateway, and OpenSearch Service encrypt data, making it unreadable to unauthorized users. API Gateway, Lambda, and AWS Identity and Access Management (IAM) give you precise control over who can access the system and what they can do, and API Gateway and OpenSearch Service provide authentication to prevent unauthorized entry and avoid potential security issues.

    Read the Security whitepaper 
  • This Guidance uses services with high reliability so that your system stays available and trustworthy for users. AWS services like SageMaker, Lambda, and OpenSearch Service are highly available, scale automatically to handle more users without slowing down, and use built-in backup plans to protect your data from loss or damage. Additionally, services like API Gateway and Lambda handle errors smoothly so that your users won’t notice interruptions.

    Read the Reliability whitepaper 
  • This Guidance uses services that automate tasks, like setting up models, handling requests, and adjusting to changes. This makes your system faster and more efficient without requiring lots of manual work. SageMaker automates machine learning (ML) model deployment, improving overall application responsiveness. API Gateway efficiently manages incoming requests, minimizing response times. Lambda functions automatically scale to handle varying workloads, and OpenSearch Service provides fast and accurate document retrieval, making the process of finding similar documents quick and responsive.

    Read the Performance Efficiency whitepaper 
  • This Guidance supports cost optimization by minimizing idle resource usage, adopting efficient pricing models, reducing maintenance overhead, and optimizing data handling, ultimately leading to lower operational costs. For example, SageMaker, API Gateway, and Lambda automatically scale and allocate resources based on demand. Managed services like SageMaker and OpenSearch Service also reduce the operational burden on your development team, lowering the costs of infrastructure management and maintenance. Additionally, Lambda provides a pay-as-you-go pricing model so that you’re only charged when functions are actively processing requests, and API Gateway efficiently handles requests and responses, reducing the amount of data sent over the network.

    Read the Cost Optimization whitepaper 
  • This Guidance uses services that support sustainability through automatic scalability. Serverless services such as Lambda and API Gateway use compute resources only when invoked, and OpenSearch Service and SageMaker automatically scale to match your workload’s demands. By promoting efficient resource usage, this Guidance helps you avoid unnecessary energy consumption and reduce your carbon footprint.

    Read the Sustainability whitepaper 

Implementation Resources

A detailed guide is provided for you to experiment with and use within your AWS account. It walks through each stage of building the Guidance, including deployment, usage, and cleanup.

The sample code is a starting point. It is industry validated, prescriptive but not definitive, and offers a peek under the hood to help you begin.

Machine Learning

Build a powerful question-answering bot with Amazon SageMaker, Amazon OpenSearch Service, Streamlit, and LangChain

This blog post provides a step-by-step guide with all the building blocks for creating an enterprise-ready RAG application such as a question-answering bot. 
Sample Code

Large Language Model (LLM) and Retrieval Augmented Generation (RAG)

This sample code demonstrates a RAG-based, LLM-powered question-answering bot.
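To give a sense of how such a bot ties retrieval to generation, the sketch below assembles a prompt from retrieved passages and hands it to a text generator. It is not taken from the sample code: `build_prompt`, `answer`, the prompt wording, and the stub `retrieve` and `generate` callables are hypothetical stand-ins for a real vector search and a real LLM endpoint.

```python
def build_prompt(question, passages):
    """Combine retrieved passages and the user's question into one prompt."""
    context = "\n".join(f"- {p}" for p in passages)
    return (
        "Answer the question using only the context below.\n"
        f"Context:\n{context}\n"
        f"Question: {question}\n"
        "Answer:"
    )

def answer(question, retrieve, generate):
    """Retrieve supporting passages, then ask the generator to answer."""
    passages = retrieve(question)
    return generate(build_prompt(question, passages))

# Stubs for illustration only: a fixed retriever and a generator that
# echoes its prompt instead of calling a model.
def stub_retrieve(question):
    return ["Passage A", "Passage B"]

def stub_generate(prompt):
    return prompt

result = answer("What does passage A say?", stub_retrieve, stub_generate)
print(result)
```

Passing `retrieve` and `generate` in as arguments keeps the flow testable without any deployed resources. Real implementations would replace the stubs with calls to the vector store and the model endpoint.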


The sample code; software libraries; command line tools; proofs of concept; templates; or other related technology (including any of the foregoing that are provided by our personnel) is provided to you as AWS Content under the AWS Customer Agreement, or the relevant written agreement between you and AWS (whichever applies). You should not use this AWS Content in your production accounts, or on production or other critical data. You are responsible for testing, securing, and optimizing the AWS Content, such as sample code, as appropriate for production grade use based on your specific quality control practices and standards. Deploying AWS Content may incur AWS charges for creating or using AWS chargeable resources, such as running Amazon EC2 instances or using Amazon S3 storage.

References to third-party services or organizations in this Guidance do not imply an endorsement, sponsorship, or affiliation between Amazon or AWS and the third party. Guidance from AWS is a technical starting point, and you can customize your integration with third-party services when you deploy the architecture.
