
Overview

This Guidance shows how to unlock instant insights with a generative AI assistant that transforms content consumption across diverse sources, including web documents, PDFs, media files, and YouTube videos. Using
Amazon Bedrock large language models (LLMs) and other AWS services, you can upload documents or share URLs and then receive instant, comprehensive summaries without sifting through extensive content. The interactive chat interface enables real-time conversations with the AI assistant, allowing you to ask questions and explore topics in depth. Every interaction and chat session remains saved for future reference, enhancing your productivity through efficient information management.

How it works

These technical details include an architecture diagram that illustrates how to use this solution effectively. The diagram shows the key components and their interactions, providing a step-by-step overview of the architecture's structure and functionality.

Deploy with confidence

Ready to deploy? Review the sample code on GitHub for detailed deployment instructions, then deploy the Guidance as-is or customize it to fit your needs.

Go to sample code

Well-Architected Pillars

The architecture diagram above is an example of a Solution created with Well-Architected best practices in mind. To be fully Well-Architected, follow as many of these best practices as possible.

Operational Excellence

Amplify streamlines development and deployment by providing a robust continuous integration and continuous deployment (CI/CD) pipeline, helping ensure consistent, automated deployments. Amazon Bedrock enables straightforward integration with multiple foundation models. For example, this Guidance uses Anthropic's Claude model through Amazon Bedrock to process input documents and web URLs, generate summaries, and power the chat functionality. Additionally, DynamoDB provides auto scaling capabilities that eliminate the operational overhead of database management.
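Because every chat session is saved for later reference, the backend needs to persist each interaction. As a sketch, assuming a DynamoDB table keyed by user and timestamp (the table and attribute names here are illustrative, not the Guidance's actual schema), a saved chat record might be built like this:

```python
import time
import uuid

def build_chat_record(user_id: str, question: str, answer: str) -> dict:
    """Build a DynamoDB put_item request storing one chat interaction.

    "ChatSessions" and the attribute names are assumed for illustration.
    """
    return {
        "TableName": "ChatSessions",
        "Item": {
            "userId": {"S": user_id},                          # partition key
            "timestamp": {"N": str(int(time.time() * 1000))},  # sort key (ms epoch)
            "messageId": {"S": str(uuid.uuid4())},
            "question": {"S": question},
            "answer": {"S": answer},
        },
    }
```

With boto3, the record would be written via `boto3.client("dynamodb").put_item(**record)`; keying on user and timestamp lets the application list a user's past sessions in order.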

Read the Operational Excellence whitepaper 

Security

Amplify provides secure hosting with built-in SSL/TLS encryption, while implementing identity management through Amazon Cognito for robust authentication and authorization. With Amazon Bedrock, you gain full control over the data you use to customize the foundation models for your generative AI applications. This service encrypts all data both in transit and at rest, while helping ensure your data remains isolated and protected. When you fine-tune foundation models, Amazon Bedrock creates a private copy of that model, so your data is not shared with model providers or used to improve the base models. Through AWS Identity and Access Management (IAM), you can maintain precise control over resource access, managing user permissions and sign-in capabilities for related resources. Additionally, DynamoDB helps ensure data security through encryption at rest.
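To illustrate the IAM control mentioned above, a least-privilege policy could scope the backend role down to invoking a single Bedrock model. This is a sketch, not the Guidance's actual policy; the model ARN is a placeholder you would replace with your own:

```python
import json

def invoke_model_policy(model_arn: str) -> str:
    """Return a least-privilege IAM policy document (as JSON) that allows
    invoking only the given Bedrock model, including streaming responses."""
    policy = {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": [
                    "bedrock:InvokeModel",
                    "bedrock:InvokeModelWithResponseStream",
                ],
                "Resource": model_arn,
            }
        ],
    }
    return json.dumps(policy, indent=2)
```

Attaching a policy like this to the application's execution role keeps it from calling models other than the one the Guidance deploys.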

Read the Security whitepaper 

Reliability

Amazon CloudWatch monitors all services configured in this Guidance, collecting metrics and presenting them through intuitive dashboards that offer visibility into application health and operational status. CloudWatch also provides advanced analysis capabilities that simplify debugging in distributed systems, offering detailed insights into application performance and underlying service behavior.
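As a concrete example of this kind of monitoring, a CloudWatch alarm could alert when a backend function starts failing. The sketch below assumes the backend runs on AWS Lambda, which the Guidance does not state explicitly; the alarm name, threshold, and period are illustrative:

```python
def build_error_alarm(function_name: str) -> dict:
    """Build a CloudWatch put_metric_alarm request that fires when a Lambda
    function reports any errors within a 5-minute window. All values here
    are illustrative defaults, not the Guidance's actual configuration."""
    return {
        "AlarmName": f"{function_name}-errors",
        "Namespace": "AWS/Lambda",
        "MetricName": "Errors",
        "Dimensions": [{"Name": "FunctionName", "Value": function_name}],
        "Statistic": "Sum",
        "Period": 300,                 # evaluate over 5-minute windows
        "EvaluationPeriods": 1,
        "Threshold": 1.0,
        "ComparisonOperator": "GreaterThanOrEqualToThreshold",
        "TreatMissingData": "notBreaching",  # no invocations is not an error
    }
```

The request would be submitted with `boto3.client("cloudwatch").put_metric_alarm(**alarm)`, typically with an SNS topic added via `AlarmActions` to notify operators.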

Read the Reliability whitepaper 

Performance Efficiency

Amplify delivers a high-performance frontend experience through global content delivery network (CDN) distribution and automatic performance optimization. Amazon Bedrock streamlines access to high-performing foundation models from industry leaders including AI21 Labs, Anthropic, Cohere, Meta, Mistral AI, Stability AI, and Amazon through a unified API interface. The Claude 3 Sonnet model delivers an optimal balance between intelligence and processing speed, supporting a context window of 200,000 tokens and a maximum output of 4,096 tokens.
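Even with a 200,000-token context window, very large documents may need to be split before summarization. A minimal sketch of such chunking follows; the 4-characters-per-token ratio is a rough heuristic for English text, not an exact tokenizer, and the function itself is an assumption rather than part of the Guidance's code:

```python
def chunk_text(text: str, max_tokens: int = 200_000, chars_per_token: int = 4) -> list[str]:
    """Split a document into pieces that fit within the model's context window.

    chars_per_token is a crude approximation; a real implementation would use
    the model's tokenizer (or reserve headroom for the prompt and the 4,096
    output tokens) before invoking the model on each chunk.
    """
    max_chars = max_tokens * chars_per_token
    return [text[i:i + max_chars] for i in range(0, len(text), max_chars)] or [""]
```

Each chunk can then be summarized independently and the partial summaries combined in a final pass, a common map-reduce pattern for long-document summarization.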

Read the Performance Efficiency whitepaper 

Cost Optimization

Amplify uses a pay-as-you-go model, helping to align costs with actual usage, and provides built-in hosting optimizations to reduce bandwidth costs. Amazon Bedrock charges for model inference and customization operations, and offers two distinct pricing plans for inference. The first, On-Demand and Batch, enables foundation model usage on a pay-as-you-go basis without time-based commitments. The second, Provisioned Throughput, lets you provision specific throughput levels to meet application performance needs in exchange for time-based commitments.
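Under the On-Demand plan, Bedrock bills input and output tokens separately, so per-request cost can be estimated from token counts and the current per-1,000-token rates. The sketch below takes those rates as parameters rather than hard-coding them, since actual prices vary by model and Region and should be taken from the Amazon Bedrock pricing page:

```python
def estimate_on_demand_cost(input_tokens: int, output_tokens: int,
                            price_in_per_1k: float, price_out_per_1k: float) -> float:
    """Estimate the on-demand cost of one inference request.

    Rates are passed in deliberately: look up the current per-1,000-token
    input and output prices for your model and Region.
    """
    return (input_tokens / 1000) * price_in_per_1k + (output_tokens / 1000) * price_out_per_1k
```

For example, a request with 2,000 input tokens and 1,000 output tokens at hypothetical rates of $0.003 and $0.015 per 1,000 tokens would cost about $0.021.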

Read the Cost Optimization whitepaper 

Sustainability

The unified API of Amazon Bedrock enables efficient access to high-performing foundation models, allowing you to select the model best suited to your sustainability goals while maintaining performance standards. This Guidance also uses Amplify's automated scaling to help ensure compute resources are used only when needed, reducing idle capacity. DynamoDB's on-demand capacity mode likewise scales resources to match actual usage patterns.
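The DynamoDB on-demand mode mentioned above is selected at table creation by choosing the `PAY_PER_REQUEST` billing mode instead of provisioning read/write capacity. A sketch of such a request follows; the table and key names are illustrative assumptions, not the Guidance's actual schema:

```python
def build_on_demand_table(table_name: str) -> dict:
    """Build a DynamoDB create_table request using on-demand capacity,
    which scales with actual traffic instead of pre-provisioned throughput.
    Key and attribute names here are illustrative."""
    return {
        "TableName": table_name,
        "BillingMode": "PAY_PER_REQUEST",  # on-demand capacity mode
        "AttributeDefinitions": [
            {"AttributeName": "userId", "AttributeType": "S"},
            {"AttributeName": "timestamp", "AttributeType": "N"},
        ],
        "KeySchema": [
            {"AttributeName": "userId", "KeyType": "HASH"},      # partition key
            {"AttributeName": "timestamp", "KeyType": "RANGE"},  # sort key
        ],
    }
```

With on-demand mode there is no capacity to tune or scale down manually, so idle periods consume no provisioned throughput.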

Read the Sustainability whitepaper 

Disclaimer

The sample code; software libraries; command line tools; proofs of concept; templates; or other related technology (including any of the foregoing that are provided by our personnel) is provided to you as AWS Content under the AWS Customer Agreement, or the relevant written agreement between you and AWS (whichever applies). You should not use this AWS Content in your production accounts, or on production or other critical data. You are responsible for testing, securing, and optimizing the AWS Content, such as sample code, as appropriate for production grade use based on your specific quality control practices and standards. Deploying AWS Content may incur AWS charges for creating or using AWS chargeable resources, such as running Amazon EC2 instances or using Amazon S3 storage.