Guidance for Chatbots with Vector Databases on AWS

Overview

This Guidance provides a step-by-step guide for creating a retrieval-augmented generation (RAG) application, such as a question-answering bot. By combining AWS services, open-source foundation models, and packages such as LangChain and Streamlit, you can create an enterprise-ready application. The RAG approach uses a similarity search to retrieve context relevant to a user's query and supplies that context to the model, improving the accuracy and completeness of the responses.
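
The sketch below illustrates this flow in Python, as a minimal example rather than the Guidance's exact implementation. The endpoint names, domain host, index name, and model payload shapes are assumptions that depend on your deployment and chosen model containers: the question is embedded, similar documents are retrieved with a k-NN search in OpenSearch Service, and the matches are added to the prompt sent to the foundation model.

```python
import json

import boto3
from opensearchpy import OpenSearch

# Hypothetical resource names -- replace these with the endpoints, domain, and index from your deployment.
EMBEDDING_ENDPOINT = "rag-embedding-endpoint"
LLM_ENDPOINT = "rag-llm-endpoint"
DOMAIN_HOST = "my-domain.us-east-1.es.amazonaws.com"
INDEX_NAME = "documents"

sagemaker_runtime = boto3.client("sagemaker-runtime")
search = OpenSearch(hosts=[{"host": DOMAIN_HOST, "port": 443}], use_ssl=True)


def embed(text):
    # Call the embedding endpoint; the request and response shapes depend on the model container you deploy.
    response = sagemaker_runtime.invoke_endpoint(
        EndpointName=EMBEDDING_ENDPOINT,
        ContentType="application/json",
        Body=json.dumps({"inputs": text}),
    )
    return json.loads(response["Body"].read())["embedding"]


def retrieve_context(question, k=3):
    # Similarity search: return the k documents whose embeddings are closest to the question's embedding.
    results = search.search(
        index=INDEX_NAME,
        body={"size": k, "query": {"knn": {"embedding": {"vector": embed(question), "k": k}}}},
    )
    return [hit["_source"]["text"] for hit in results["hits"]["hits"]]


def answer(question):
    # Augment the prompt with the retrieved context before asking the foundation model.
    context = "\n\n".join(retrieve_context(question))
    prompt = (
        "Answer the question using only the context below.\n\n"
        f"Context:\n{context}\n\nQuestion: {question}"
    )
    response = sagemaker_runtime.invoke_endpoint(
        EndpointName=LLM_ENDPOINT,
        ContentType="application/json",
        Body=json.dumps({"inputs": prompt}),
    )
    return json.loads(response["Body"].read())["generated_text"]
```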

How it works

These technical details feature an architecture diagram that illustrates the key components of this solution and how they interact, stepping through the request flow: API Gateway receives the user's question, a Lambda function retrieves similar documents from the vector index in OpenSearch Service, and models hosted on SageMaker generate the embeddings and the response.
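
As a rough illustration of how these components fit together, the hypothetical Lambda handler below accepts a question through an API Gateway proxy integration and returns the generated answer. The answer function here is only a placeholder standing in for the retrieval and generation flow sketched in the Overview.

```python
import json


def answer(question):
    # Placeholder for the embed -> similarity search -> generate flow sketched in the Overview.
    return f"(model response for: {question})"


def handler(event, context):
    # API Gateway proxy integration: the question arrives as a JSON body, and the function
    # returns a status code and JSON body that API Gateway passes back to the client.
    try:
        body = json.loads(event.get("body") or "{}")
        question = body["question"]
    except (KeyError, json.JSONDecodeError):
        return {
            "statusCode": 400,
            "body": json.dumps({"error": "Expected a JSON body with a 'question' field."}),
        }

    return {
        "statusCode": 200,
        "headers": {"Content-Type": "application/json"},
        "body": json.dumps({"answer": answer(question)}),
    }
```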

Well-Architected Pillars

The architecture diagram above is an example of a Solution created with Well-Architected best practices in mind. To be fully Well-Architected, you should follow as many Well-Architected best practices as possible.

This Guidance enhances operational excellence by automating tasks and providing capabilities that reduce manual effort, improve system reliability, and strengthen security. AWS services like SageMaker, API Gateway, Lambda, and OpenSearch Service are fully managed, removing the need for your development team to handle server provisioning, patching, and routine maintenance. They also automate aspects like model deployment, code deployment, scaling, and failover, reducing the likelihood of human error and accelerating response times during operational events.
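
For example, an open-source foundation model can be deployed to a managed endpoint with a few lines of the SageMaker Python SDK. This is a minimal sketch; the model ID and instance type shown are illustrative, and you should choose the model and size that fit your workload.

```python
from sagemaker.jumpstart.model import JumpStartModel

# Deploy an open-source foundation model from SageMaker JumpStart to a managed endpoint.
# Illustrative model ID and instance type -- substitute the ones appropriate for your use case.
model = JumpStartModel(model_id="huggingface-llm-falcon-7b-instruct-bf16")
predictor = model.deploy(initial_instance_count=1, instance_type="ml.g5.2xlarge")

# The predictor wraps the new endpoint; SageMaker handles the provisioning, patching, and hosting behind it.
print(predictor.endpoint_name)
```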

Read the Operational Excellence whitepaper 

This Guidance prioritizes security, protecting user data and interactions and building trust among users. Services like SageMaker, API Gateway, and OpenSearch Service encrypt data, making it unreadable to unauthorized users. API Gateway, Lambda, and AWS Identity and Access Management (IAM) give you precise control over who can access the system and what they can do, and API Gateway and OpenSearch Service provide authentication to prevent unauthorized access.
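
As one illustration, requests to the OpenSearch Service domain can be signed with the caller's IAM credentials so that only principals authorized by IAM and the domain access policy can reach the vector index. This is a minimal sketch; the domain host and Region are placeholders for your deployment.

```python
import boto3
from opensearchpy import OpenSearch, RequestsHttpConnection, AWSV4SignerAuth

# Sign OpenSearch requests with the caller's IAM credentials (SigV4).
region = "us-east-1"  # placeholder Region
credentials = boto3.Session().get_credentials()
auth = AWSV4SignerAuth(credentials, region, "es")

client = OpenSearch(
    hosts=[{"host": "my-domain.us-east-1.es.amazonaws.com", "port": 443}],  # placeholder domain
    http_auth=auth,
    use_ssl=True,        # encrypt traffic in transit
    verify_certs=True,
    connection_class=RequestsHttpConnection,
)

print(client.info())  # returns 403 if the caller's IAM identity is not authorized for the domain
```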

Read the Security whitepaper 

This Guidance uses services with high reliability so that your system stays available and trustworthy for users. AWS services like SageMaker, Lambda, and OpenSearch Service are highly available, scale automatically to handle more users without slowing down, and include built-in redundancy and backups to protect your data from loss or corruption. Additionally, services like API Gateway and Lambda handle errors gracefully so that your users won't notice interruptions.

Read the Reliability whitepaper 

This Guidance uses services that automate tasks such as deploying models, handling requests, and adjusting to changing demand, making your system faster and more efficient without extensive manual work. SageMaker automates machine learning (ML) model deployment, improving overall application responsiveness. API Gateway efficiently manages incoming requests, minimizing response times. Lambda functions automatically scale to handle varying workloads, and OpenSearch Service provides fast, accurate retrieval of similar documents.
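
As an example of what keeps that retrieval fast, the sketch below creates a k-NN index backed by an HNSW graph so similarity search stays responsive as the corpus grows. The domain host, index name, vector dimension, and distance metric are assumptions and must match your embedding model.

```python
from opensearchpy import OpenSearch

# Placeholder domain host -- use your OpenSearch Service domain and credentials here.
client = OpenSearch(hosts=[{"host": "my-domain.us-east-1.es.amazonaws.com", "port": 443}], use_ssl=True)

# Create a k-NN index with an HNSW method so vector similarity queries remain fast at scale.
client.indices.create(
    index="documents",
    body={
        "settings": {"index": {"knn": True}},
        "mappings": {
            "properties": {
                "text": {"type": "text"},
                "embedding": {
                    "type": "knn_vector",
                    "dimension": 768,  # must match the embedding model's output size
                    "method": {"name": "hnsw", "space_type": "cosinesimil", "engine": "nmslib"},
                },
            }
        },
    },
)
```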

Read the Performance Efficiency whitepaper 

This Guidance supports cost optimization by minimizing idle resource usage, adopting efficient pricing models, reducing maintenance overhead, and optimizing data handling, ultimately leading to lower operational costs. For example, SageMaker, API Gateway, and Lambda automatically scale and allocate resources based on demand. Managed services like SageMaker and OpenSearch Service also reduce the operational burden on your development team, lowering the costs of infrastructure management and maintenance. Additionally, Lambda provides a pay-as-you-go pricing model so that you’re only charged when functions are actively processing requests, and API Gateway efficiently handles requests and responses, reducing the amount of data sent over the network.

Read the Cost Optimization whitepaper 

This Guidance uses services that support sustainability through automatic scalability. Serverless services such as Lambda and API Gateway use compute resources only when invoked, and OpenSearch Service and SageMaker automatically scale to match your workload’s demands. By promoting efficient resource usage, this Guidance helps you avoid unnecessary energy consumption and reduce your carbon footprint.

Read the Sustainability whitepaper 

Disclaimer

The sample code; software libraries; command line tools; proofs of concept; templates; or other related technology (including any of the foregoing that are provided by our personnel) is provided to you as AWS Content under the AWS Customer Agreement, or the relevant written agreement between you and AWS (whichever applies). You should not use this AWS Content in your production accounts, or on production or other critical data. You are responsible for testing, securing, and optimizing the AWS Content, such as sample code, as appropriate for production grade use based on your specific quality control practices and standards. Deploying AWS Content may incur AWS charges for creating or using AWS chargeable resources, such as running Amazon EC2 instances or using Amazon S3 storage.

References to third-party services or organizations in this Guidance do not imply an endorsement, sponsorship, or affiliation between Amazon or AWS and the third party. Guidance from AWS is a technical starting point, and you can customize your integration with third-party services when you deploy the architecture.