AWS Machine Learning Blog
Build your gen AI–based text-to-SQL application using RAG, powered by Amazon Bedrock (Claude 3.5 Sonnet and Amazon Titan for embedding)
SQL is one of the key languages widely used across businesses, and it requires an understanding of databases and table metadata. This can be overwhelming for nontechnical users who lack proficiency in SQL. Today, generative AI can bridge this knowledge gap: a text-to-SQL application lets users ask questions in natural language and generates a SQL query for each request.
Large language models (LLMs) can generate accurate SQL queries from natural language instructions. However, off-the-shelf LLMs can't be used without some modification. First, LLMs don't have access to enterprise databases, so a model needs to be customized to understand the specific database of an enterprise. Additionally, complexity increases when there are synonyms for columns and internal metrics.
The limitation of LLMs in understanding enterprise datasets and human context can be addressed using Retrieval Augmented Generation (RAG). In this post, we explore using Amazon Bedrock to create a text-to-SQL application with RAG. We use Anthropic's Claude 3.5 Sonnet to generate SQL queries and Amazon Titan Text Embeddings v2 for text embedding, accessing both models through Amazon Bedrock.
Amazon Bedrock is a fully managed service that offers a choice of high-performing foundation models (FMs) from leading AI companies like AI21 Labs, Anthropic, Cohere, Meta, Mistral AI, Stability AI, and Amazon through a single API, along with a broad set of capabilities you need to build generative AI applications with security, privacy, and responsible AI.
Solution overview
This solution is primarily based on the following services:
- Foundation model – We use Anthropic’s Claude 3.5 Sonnet on Amazon Bedrock as our LLM to generate SQL queries for user inputs.
- Vector embeddings – We use Amazon Titan Text Embeddings v2 on Amazon Bedrock for embeddings. Embedding is the process by which text, images, and audio are given numerical representation in a vector space. Embedding is usually performed by a machine learning (ML) model. The following diagram provides more details about embeddings.
- RAG – We use RAG for providing more context about table schema, column synonyms, and sample queries to the FM. RAG is a framework for building generative AI applications that can make use of enterprise data sources and vector databases to overcome knowledge limitations. RAG works by using a retriever module to find relevant information from an external data store in response to a user’s prompt. This retrieved data is used as context, combined with the original prompt, to create an expanded prompt that is passed to the LLM. The language model then generates a SQL query that incorporates the enterprise knowledge. The following diagram illustrates the RAG framework.
- Streamlit – This open source Python library makes it straightforward to create and share beautiful, custom web apps for ML and data science. In just a few minutes you can build powerful data apps using only Python.
The following diagram shows the solution architecture.
We need to update the LLM with enterprise-specific database knowledge. This makes sure that the model can correctly understand the database and generate responses tailored to the enterprise's data schema and tables. There are multiple file formats available for storing this information, such as JSON, PDF, TXT, and YAML. In our case, we created JSON files to store the table schema, table descriptions, columns with synonyms, and sample queries. JSON’s inherently structured format allows for a clear and organized representation of complex data such as table schemas, column definitions, synonyms, and sample queries. This structure facilitates quick parsing and manipulation of data in most programming languages, reducing the need for custom parsing logic.
There can be multiple tables with similar information, which can lower the model’s accuracy. To increase accuracy, we categorized the tables into four different types based on the schema and created four JSON files to store the different tables. We added a dropdown menu with four choices. Each choice represents one of these four categories and is linked to an individual JSON file. After the user selects a value from the dropdown menu, the relevant JSON file is passed to Amazon Titan Text Embeddings v2, which converts the text into embeddings. These embeddings are stored in a vector database for faster retrieval.
We added a prompt template that defines the role and responsibilities of the model. You can add additional information, such as which SQL engine should be used to generate the SQL queries.
When the user provides input through the chat prompt, we use similarity search to find the relevant table metadata in the vector database for the user’s query. The user input, the retrieved table metadata, and the prompt template are combined and passed to the FM as a single input. The FM generates the SQL query based on this final input.
To evaluate the model’s accuracy and provide a tracking mechanism, we store every user input and output in Amazon Simple Storage Service (Amazon S3).
Prerequisites
To create this solution, complete the following prerequisites:
- Sign up for an AWS account if you don’t already have one.
- Enable model access for Amazon Titan Text Embeddings v2 and Anthropic’s Claude 3.5 Sonnet on Amazon Bedrock.
- Create an S3 bucket named simplesql-logs-****, replacing **** with your unique identifier. Bucket names are globally unique across the entire Amazon S3 service.
- Choose your testing environment. We recommend that you test in Amazon SageMaker Studio, although you can use other available local environments.
- Install the following libraries to execute the code:
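The exact package list depends on your implementation; assuming a LangChain-based pipeline with a local FAISS vector store, as sketched in the following sections, a typical install is:

```
pip install streamlit boto3 langchain langchain-aws langchain-community faiss-cpu
```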
Procedure
There are three main components in this solution:
- JSON files that store the table schema
- LLM configuration and vector indexing using Amazon Bedrock
- Streamlit for the front-end UI
You can download all three components and the code snippets provided in the following sections.
Generate the table schema
We use the JSON format to store the table schema. To provide more inputs to the model, we added a table name and its description, columns and their synonyms, and sample queries in our JSON files. Create a JSON file named Table_Schema_A.json by copying the following code into it:
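The original schema file isn't reproduced here; the following is a minimal illustrative sketch with hypothetical table and column names that shows the expected structure (table name and description, columns with synonyms, and sample queries):

```json
{
  "tables": [
    {
      "table_name": "orders",
      "description": "One row per customer order, including status, geography, and dates.",
      "columns": [
        {"name": "order_id", "type": "varchar", "synonyms": ["order number", "order code"]},
        {"name": "order_status", "type": "varchar", "synonyms": ["status", "order state"]},
        {"name": "order_date", "type": "date", "synonyms": ["purchase date", "placed on"]},
        {"name": "country", "type": "varchar", "synonyms": ["region", "marketplace"]}
      ],
      "sample_queries": [
        {
          "question": "How many orders were placed last month?",
          "query": "SELECT COUNT(*) FROM orders WHERE order_date >= DATE_TRUNC('month', CURRENT_DATE) - INTERVAL '1' MONTH AND order_date < DATE_TRUNC('month', CURRENT_DATE)"
        }
      ]
    }
  ]
}
```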
Configure the LLM and initialize vector indexing using Amazon Bedrock
Create a Python file named library.py by following these steps:
- Add the following import statements to add the necessary libraries:
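A sketch of the imports, assuming the LangChain and FAISS implementation used throughout these examples:

```python
import json

import boto3
from langchain_aws import BedrockEmbeddings, ChatBedrock
from langchain_community.vectorstores import FAISS
from langchain.chains import ConversationalRetrievalChain
from langchain.memory import ConversationBufferMemory
from langchain_core.documents import Document
from langchain_core.prompts import PromptTemplate
```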
- Initialize the Amazon Bedrock client and configure Anthropic’s Claude 3.5 Sonnet. You can limit the number of output tokens to optimize the cost:
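For example (the Region and token limit here are assumptions; adjust them for your account):

```python
# Bedrock runtime client in a Region where model access is enabled
bedrock_runtime = boto3.client("bedrock-runtime", region_name="us-east-1")

# Capping max_tokens bounds the output size and therefore the cost
llm = ChatBedrock(
    client=bedrock_runtime,
    model_id="anthropic.claude-3-5-sonnet-20240620-v1:0",
    model_kwargs={"max_tokens": 1024, "temperature": 0.0},
)
```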
- Create and return an index for the given schema type. This approach is an efficient way to filter tables and provide relevant input to the model:
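A minimal sketch that builds one FAISS index per schema category, embedding each table's metadata as a separate document (the file naming follows the Table_Schema_A.json convention above):

```python
def get_index(schema_type: str) -> FAISS:
    """Create and return a vector index for the selected schema category."""
    embeddings = BedrockEmbeddings(
        client=bedrock_runtime, model_id="amazon.titan-embed-text-v2:0"
    )
    with open(f"Table_Schema_{schema_type}.json") as f:
        schema = json.load(f)
    # One document per table, so similarity search returns whole-table metadata
    docs = [Document(page_content=json.dumps(table)) for table in schema["tables"]]
    return FAISS.from_documents(docs, embeddings)
```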
- Use the following function to create and return memory for the chat session:
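A sketch using LangChain's conversation buffer memory so follow-up questions keep prior context:

```python
def get_memory() -> ConversationBufferMemory:
    """Create and return memory for the chat session."""
    return ConversationBufferMemory(
        memory_key="chat_history", return_messages=True, output_key="answer"
    )
```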
- Use the following prompt template to generate SQL queries based on user input:
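The exact wording and the target SQL engine are assumptions; a sketch of such a template:

```python
sql_prompt = PromptTemplate(
    input_variables=["context", "question"],
    template="""You are a SQL expert. Using only the table metadata provided
in the context, write a syntactically correct SQL query (Amazon Athena /
Presto dialect) that answers the user's question. Return only the SQL query.

Context: {context}

Question: {question}

SQL query:""",
)
```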
- Use the following function to get a response from the RAG chat model:
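A sketch that wires the retriever, memory, and prompt template into a conversational RAG chain:

```python
def get_rag_response(index: FAISS, memory: ConversationBufferMemory, question: str) -> str:
    """Retrieve relevant table metadata and generate a SQL query."""
    chain = ConversationalRetrievalChain.from_llm(
        llm=llm,
        retriever=index.as_retriever(search_kwargs={"k": 3}),
        memory=memory,
        combine_docs_chain_kwargs={"prompt": sql_prompt},
    )
    return chain.invoke({"question": question})["answer"]
```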
Configure Streamlit for the front-end UI
Create the file app.py by following these steps:
- Import the necessary libraries:
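Assuming app.py builds on the library.py helpers above:

```python
import datetime
import json

import boto3
import streamlit as st

import library as lib  # helper functions defined in library.py
```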
- Initialize the S3 client:
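Point the client at the bucket you created in the prerequisites:

```python
s3_client = boto3.client("s3")
BUCKET_NAME = "simplesql-logs-****"  # replace **** with your unique identifier
```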
- Configure Streamlit for UI:
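A sketch of the UI configuration, including the dropdown menu that links each of the four table categories to its JSON file:

```python
st.set_page_config(page_title="Text-to-SQL with Amazon Bedrock")
st.title("Text-to-SQL generator")

# Dropdown with the four schema categories, each linked to one JSON file
schema_type = st.selectbox("Select a table category", ["A", "B", "C", "D"])

if "memory" not in st.session_state:
    st.session_state.memory = lib.get_memory()

# Rebuild the vector index whenever the selected category changes
if st.session_state.get("schema_type") != schema_type:
    st.session_state.index = lib.get_index(schema_type)
    st.session_state.schema_type = schema_type
```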
- Generate the query:
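A sketch in which the Generate SQL query button opens the chat window, matching the flow described in the testing section below:

```python
# The button opens the chat window where users ask questions in natural language
if st.button("Generate SQL query"):
    st.session_state.chat_open = True

question = None
if st.session_state.get("chat_open"):
    question = st.chat_input("Ask a question about your data")
```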
- Use the following for SQL generation:
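Pass the question through the RAG chain from library.py and render the result:

```python
if question:
    with st.chat_message("user"):
        st.write(question)
    sql_query = lib.get_rag_response(
        st.session_state.index, st.session_state.memory, question
    )
    with st.chat_message("assistant"):
        st.code(sql_query, language="sql")
```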
- Log the conversations to the S3 bucket:
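Continuing inside the if question: block from the previous step, a sketch that writes each question/query pair to the bucket for later accuracy review:

```python
    # Persist every input and output pair for evaluation
    log_key = f"logs/{datetime.datetime.utcnow().isoformat()}.json"
    s3_client.put_object(
        Bucket=BUCKET_NAME,
        Key=log_key,
        Body=json.dumps({"question": question, "sql_query": sql_query}),
    )
```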
Test the solution
Open your terminal and invoke the following command to run the Streamlit application.
streamlit run app.py
To open the application in your browser, navigate to localhost (Streamlit serves on port 8501 by default, so http://localhost:8501).
To open the application from SageMaker, copy your notebook URL and replace default/lab in the URL with default/proxy/8501/. It should look something like the following:
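For example (your domain ID and Region will differ):

https://d-xxxxxxxxxxxx.studio.us-east-1.sagemaker.aws/jupyter/default/proxy/8501/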
Choose Generate SQL query to open the chat window. Test your application by asking questions in natural language. We tested the application with the following questions, and it generated accurate SQL queries.
Count of orders placed from India last month?
Write a query to extract the canceled order count for the items that were listed this year.
Write a query to extract the top 10 item names having highest order for each country.
Troubleshooting tips
Use the following solutions to address errors:
Error – An error raised by the inference endpoint: An error occurred (AccessDeniedException) when calling the InvokeModel operation. You don’t have access to the model with the specified model ID.
Solution – Make sure you have enabled access in Amazon Bedrock to the FMs used in this solution: Amazon Titan Text Embeddings v2 and Anthropic’s Claude 3.5 Sonnet.
Error – app.py does not exist
Solution – Make sure your JSON files and Python files are in the same folder and that you’re invoking the command from that folder.
Error – No module named streamlit
Solution – Open the terminal and install the streamlit module by running the command pip install streamlit
Error – An error occurred (NoSuchBucket) when calling the GetObject operation. The specified bucket doesn’t exist.
Solution – Verify your bucket name in the app.py file and update the name based on your S3 bucket name.
Clean up
Clean up the resources you created to avoid incurring charges. To clean up your S3 bucket, refer to Emptying a bucket.
Conclusion
In this post, we showed how Amazon Bedrock can be used to create a text-to-SQL application based on enterprise-specific datasets. We used Amazon S3 to store the outputs generated by the model for corresponding inputs. These logs can be used to test the accuracy and enhance the context by providing more details in the knowledge base. With the aid of a tool like this, you can create automated solutions that are accessible to nontechnical users, empowering them to interact with data more efficiently.
Ready to get started with Amazon Bedrock? Start learning with these interactive workshops.
For more information on SQL generation, refer to these posts:
- Build a robust text-to-SQL solution generating complex queries, self-correcting, and querying diverse data sources
- Imperva optimizes SQL generation from natural language using Amazon Bedrock
We recently launched a managed NL2SQL module to retrieve structured data in Amazon Bedrock Knowledge Bases. To learn more, visit Amazon Bedrock Knowledge Bases now supports structured data retrieval.
About the Author
Rajendra Choudhary is a Sr. Business Analyst at Amazon. With 7 years of experience in developing data solutions, he possesses profound expertise in data visualization, data modeling, and data engineering. He is passionate about supporting customers by leveraging generative AI–based solutions. Outside of work, Rajendra is an avid foodie and music enthusiast, and he enjoys swimming and hiking.