
Guidance for Retrieving Data Using Natural Language Queries on AWS

Overview

This Guidance demonstrates how to retrieve data efficiently by using the agent-driven framework of Amazon Bedrock to convert natural language queries (NLQ) into SQL queries. With this approach, Amazon Bedrock Agents interpret your natural language input, break down complex queries, and delegate specific actions to the appropriate large language models (LLMs) and services. The agents orchestrate the entire process in an automated and coordinated manner, so you do not need to construct database queries manually. By using Amazon Bedrock Agents to handle the complex task of NLQ-to-SQL conversion, you can access and analyze data more efficiently and accurately.
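As a rough illustration of the entry point to this flow, the sketch below sends a natural language question to an agent and collects the streamed reply. It assumes a boto3 `bedrock-agent-runtime` client is passed in as `client`. The helper name `ask_agent` and the agent and alias identifiers are hypothetical and do not come from this Guidance or its sample code.

```python
import uuid


def ask_agent(client, agent_id, agent_alias_id, question, session_id=None):
    """Send a natural language question to a Bedrock agent and return its text reply.

    `client` is assumed to behave like boto3.client("bedrock-agent-runtime").
    """
    response = client.invoke_agent(
        agentId=agent_id,
        agentAliasId=agent_alias_id,
        # Reusing a session ID lets the agent keep conversational context.
        sessionId=session_id or str(uuid.uuid4()),
        inputText=question,
    )
    # The reply arrives as an event stream; answer text is carried in "chunk" events.
    parts = []
    for event in response["completion"]:
        chunk = event.get("chunk")
        if chunk:
            parts.append(chunk["bytes"].decode("utf-8"))
    return "".join(parts)
```

With valid AWS credentials and a deployed agent, a call might look like `ask_agent(boto3.client("bedrock-agent-runtime"), "AGENT_ID", "ALIAS_ID", "How many orders shipped last month?")`, where both IDs are placeholders.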

How it works

This section features an architecture diagram that illustrates how to use this solution effectively. The diagram shows the key components and their interactions, giving a step-by-step overview of the architecture's structure and functionality.

Deploy with confidence

Ready to deploy? Review the sample code on GitHub for detailed deployment instructions to deploy as-is or customize to fit your needs. 


Well-Architected Pillars

The architecture diagram above is an example of a Solution created with Well-Architected best practices in mind. To be fully Well-Architected, you should follow as many Well-Architected best practices as possible.

Operational Excellence

Amazon S3 provides scalable and durable storage for source data, keeping it readily accessible and manageable. AWS Glue automates data cataloging by running crawlers and creating tables, which streamlines the data integration workflow. Amazon Athena queries the data using standard SQL, and AWS Lambda functions integrate with Athena to execute the SQL queries generated by the Amazon Bedrock Agent, facilitating real-time data processing. Together, these services automate data ingestion, cataloging, and querying, enabling quick responses to user queries and simpler management of data workflows.
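The Lambda-to-Athena step described above could look roughly like the following sketch, which submits a SQL string, polls until the query finishes, and returns the rows. It assumes a boto3 Athena client is passed in as `athena`. The function name `run_athena_query`, its parameters, and the polling limits are illustrative assumptions, not the implementation in the sample code.

```python
import time


def run_athena_query(athena, sql, database, output_location,
                     poll_seconds=1.0, max_polls=60):
    """Run SQL in Athena and return result rows as a list of dicts.

    `athena` is assumed to behave like boto3.client("athena").
    """
    start = athena.start_query_execution(
        QueryString=sql,
        QueryExecutionContext={"Database": database},
        ResultConfiguration={"OutputLocation": output_location},
    )
    query_id = start["QueryExecutionId"]

    # Athena runs queries asynchronously, so poll until a terminal state.
    for _ in range(max_polls):
        status = athena.get_query_execution(
            QueryExecutionId=query_id
        )["QueryExecution"]["Status"]
        state = status["State"]
        if state == "SUCCEEDED":
            break
        if state in ("FAILED", "CANCELLED"):
            reason = status.get("StateChangeReason", "")
            raise RuntimeError(f"Query {query_id} {state}: {reason}")
        time.sleep(poll_seconds)
    else:
        raise TimeoutError(f"Query {query_id} did not finish in time")

    # Only the first page of results is read here; larger result sets
    # would need pagination. For SELECT statements the first row holds
    # the column names.
    rows = athena.get_query_results(QueryExecutionId=query_id)["ResultSet"]["Rows"]
    if not rows:
        return []
    header = [cell.get("VarCharValue") for cell in rows[0]["Data"]]
    return [
        dict(zip(header, (cell.get("VarCharValue") for cell in row["Data"])))
        for row in rows[1:]
    ]
```

Passing the client in as an argument, instead of creating it inside the function, keeps the helper easy to exercise with a stub during testing.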

Read the Operational Excellence whitepaper 

Security

AWS Shield Standard can be integrated to protect against distributed denial-of-service (DDoS) attacks. Amazon Bedrock Guardrails adds a customizable layer of safeguards that applies regardless of the underlying foundation model (FM). It evaluates user inputs and FM responses against specific policies to detect and block undesirable topics. Amazon Bedrock Guardrails also filters harmful content, redacts sensitive information, blocks inappropriate content with custom word filters, and detects hallucinations in model responses. AWS Identity and Access Management (IAM) limits unauthorized access by enforcing least-privilege permissions, and Amazon CloudFront secures data transmission and access so that only authorized users can interact with the application.
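To make the least-privilege idea concrete, the sketch below builds an IAM policy document that a query-running Lambda role might use. The function name, the split into data and results buckets, and the exact action list are assumptions for illustration. The permissions a real deployment needs depend on how the sample code is configured.

```python
import json


def athena_query_policy(region, account_id, workgroup, database,
                        data_bucket, results_bucket):
    """Build an illustrative least-privilege IAM policy for running Athena queries."""
    glue_prefix = f"arn:aws:glue:{region}:{account_id}"
    return {
        "Version": "2012-10-17",
        "Statement": [
            {   # Start queries and read their status and results in one workgroup.
                "Sid": "RunAthenaQueries",
                "Effect": "Allow",
                "Action": [
                    "athena:StartQueryExecution",
                    "athena:GetQueryExecution",
                    "athena:GetQueryResults",
                ],
                "Resource": f"arn:aws:athena:{region}:{account_id}:workgroup/{workgroup}",
            },
            {   # Read table metadata from a single Glue database.
                "Sid": "ReadGlueCatalog",
                "Effect": "Allow",
                "Action": ["glue:GetDatabase", "glue:GetTable", "glue:GetPartitions"],
                "Resource": [
                    f"{glue_prefix}:catalog",
                    f"{glue_prefix}:database/{database}",
                    f"{glue_prefix}:table/{database}/*",
                ],
            },
            {   # Source data is read-only.
                "Sid": "ReadSourceData",
                "Effect": "Allow",
                "Action": ["s3:GetObject", "s3:ListBucket"],
                "Resource": [
                    f"arn:aws:s3:::{data_bucket}",
                    f"arn:aws:s3:::{data_bucket}/*",
                ],
            },
            {   # Athena writes query output to the results bucket.
                "Sid": "WriteQueryResults",
                "Effect": "Allow",
                "Action": [
                    "s3:GetBucketLocation",
                    "s3:GetObject",
                    "s3:ListBucket",
                    "s3:PutObject",
                ],
                "Resource": [
                    f"arn:aws:s3:::{results_bucket}",
                    f"arn:aws:s3:::{results_bucket}/*",
                ],
            },
        ],
    }


def policy_json(policy):
    """Serialize the policy for use with IAM tooling."""
    return json.dumps(policy, indent=2)
```

Scoping each statement to a named workgroup, database, and bucket, with no wildcard actions, is what keeps the role from reaching resources outside this workload.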

Read the Security whitepaper 

Reliability

The services used are designed to provide high availability, automated recovery, and consistent performance. Amazon S3 offers high availability and durability for the stored data. AWS Glue automates data processing tasks, reducing the risk of human error and enabling consistent data handling. Athena allows for reliable and efficient querying of the data. Lambda runs queries and other tasks without downtime. Together, these services allow your system to adapt quickly to changing demands and recover from potential failures.

Read the Reliability whitepaper 

Performance Efficiency

Amazon S3 and AWS Glue optimize data storage and processing for efficient data management. Athena provides quick and efficient data querying capabilities. Lambda runs tasks in a serverless environment, further optimizing resource usage. These services provide scalability, optimized resource usage, and the ability to experiment and optimize based on your user data and requirements.

Read the Performance Efficiency whitepaper 

Cost Optimization

The cost efficiencies come from the storage, processing, analytics, and compute capabilities of the services. Amazon S3 provides tiered storage options for cost-optimized data retention. AWS Glue automates data processing tasks, eliminating the need for manual intervention. Athena enables cost-efficient analysis of large datasets without the overhead of dedicated database infrastructure. Lambda functions contribute to cost optimization by providing a serverless execution environment.

Read the Cost Optimization whitepaper 

Sustainability

Amazon S3 supports sustainable storage practices through lifecycle policies and tiered storage options, which optimize resource utilization. AWS Glue automates data processing tasks, reducing the need for continuous resource usage. Athena provides efficient data querying capabilities, minimizing the computational resources required. Lambda functions offer a serverless execution model, so resources are consumed only when necessary. By optimizing resource utilization, reducing waste, and minimizing environmental impact, these services help you build more sustainable workloads.

Read the Sustainability whitepaper 

Disclaimer

The sample code; software libraries; command line tools; proofs of concept; templates; or other related technology (including any of the foregoing that are provided by our personnel) is provided to you as AWS Content under the AWS Customer Agreement, or the relevant written agreement between you and AWS (whichever applies). You should not use this AWS Content in your production accounts, or on production or other critical data. You are responsible for testing, securing, and optimizing the AWS Content, such as sample code, as appropriate for production grade use based on your specific quality control practices and standards. Deploying AWS Content may incur AWS charges for creating or using AWS chargeable resources, such as running Amazon EC2 instances or using Amazon S3 storage.