AWS Database Blog

Category: Technical How-to

Heterogeneous data sources: Access your data in PostgreSQL from Amazon RDS for Oracle using Oracle Database Gateway

In certain customer scenarios, Amazon RDS for Oracle databases need to connect to external data sources, such as Amazon RDS for PostgreSQL. While PostgreSQL can establish connections to Oracle databases using a foreign data wrapper (FDW), connecting in the opposite direction requires a gateway. In this post, we walk you through setting up an Amazon EC2 instance as a database gateway server. You will install and configure Oracle Database Gateway for ODBC (DG4ODBC), ODBC drivers, a PostgreSQL client, and PostgreSQL libraries. With this setup, you can create database links on RDS for Oracle that connect to PostgreSQL through the gateway.
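
To give a feel for the end state, here is a minimal sketch of creating and using such a database link from the Oracle side with the python-oracledb driver. The endpoint, credentials, gateway host, SID, and table name are placeholder assumptions, not values from the post.

```python
# A minimal sketch, assuming a DG4ODBC gateway is already running on an EC2
# instance and listening on port 1521. All names and credentials below are
# hypothetical placeholders.
import oracledb

conn = oracledb.connect(
    user="admin",
    password="example-password",
    dsn="my-rds-oracle.abc123.us-east-1.rds.amazonaws.com:1521/ORCL",
)
cur = conn.cursor()

# Create a database link that routes through the gateway EC2 instance.
# HS=OK marks the target as a heterogeneous (non-Oracle) service.
cur.execute("""
    CREATE DATABASE LINK pg_link
    CONNECT TO "pguser" IDENTIFIED BY "pg-password"
    USING '(DESCRIPTION=(ADDRESS=(PROTOCOL=TCP)(HOST=gateway-ec2-host)(PORT=1521))
           (CONNECT_DATA=(SID=PGGW))(HS=OK))'
""")

# Query a PostgreSQL table through the link; lower-case PostgreSQL
# identifiers need double quotes when referenced from Oracle.
for row in cur.execute('SELECT * FROM "employees"@pg_link'):
    print(row)
```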

Capture and diagnose I/O bottlenecks on Amazon RDS for SQL Server

In our previous post, Capture and tune resource utilization metrics for Amazon RDS for SQL Server, we demonstrated how to use Amazon RDS Enhanced Monitoring and Amazon RDS Performance Insights to diagnose and debug CPU utilization bottlenecks for Amazon Relational Database Service (Amazon RDS) for SQL Server. Aside from CPU and memory, I/O performance is critical to overall database performance. It’s important to understand the I/O requirements of a SQL Server workload, which depend on factors such as query access patterns, database schema, and the state of database maintenance. Understanding your workload’s I/O patterns can guide you in selecting the optimal storage type for your RDS instance, balancing performance needs with cost-effectiveness. In this post, we demonstrate how you can use Amazon RDS monitoring tools along with SQL Server monitoring capabilities to capture, diagnose, and resolve I/O issues on an RDS for SQL Server instance.
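
As one way to capture such metrics programmatically, here is a minimal sketch that pulls I/O counters from the Performance Insights API with boto3. The resource identifier and the exact metric names (which vary by engine) are assumptions for illustration.

```python
# A minimal sketch: fetch I/O-related OS counters for an RDS for SQL Server
# instance from the Performance Insights API. The DbiResourceId and the
# metric names are placeholder assumptions.
from datetime import datetime, timedelta, timezone
import boto3

pi = boto3.client("pi", region_name="us-east-1")
end = datetime.now(timezone.utc)

response = pi.get_resource_metrics(
    ServiceType="RDS",
    Identifier="db-ABCDEFGHIJKL123",             # DbiResourceId of the instance
    MetricQueries=[
        {"Metric": "os.diskIO.readIOsPS.avg"},   # assumed counter name
        {"Metric": "os.diskIO.writeIOsPS.avg"},  # assumed counter name
    ],
    StartTime=end - timedelta(hours=1),
    EndTime=end,
    PeriodInSeconds=60,
)

for metric in response["MetricList"]:
    print(metric["Key"]["Metric"])
    for point in metric["DataPoints"]:
        print(point["Timestamp"], point.get("Value"))
```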

Tune Amazon RDS for Oracle CDBs with Amazon Performance Insights

With Oracle Multitenant, you can consolidate standalone databases by either creating them as pluggable databases (PDBs) or migrating them to PDBs. Performance Insights has introduced a new PDB dimension to help you visualize and analyze the distribution of the load on individual PDBs within the container database (CDB) on an RDS for Oracle instance. You can now slice the database load metric by the PDB and SQL dimensions to identify the top queries running on each PDB. In this post, we discuss how to identify resource-intensive SQL queries at the PDB level on a visual dashboard in Performance Insights.
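
The same slicing is available through the Performance Insights API. The sketch below groups database load by SQL and partitions it by a PDB-level dimension; the dimension group name used here (db.pdbs) is an assumption, so check the Performance Insights documentation for the exact key.

```python
# A minimal sketch: slice db.load by SQL and partition by PDB to find the
# top queries per PDB. The DbiResourceId and the PDB dimension group name
# are placeholder assumptions.
from datetime import datetime, timedelta, timezone
import boto3

pi = boto3.client("pi", region_name="us-east-1")
end = datetime.now(timezone.utc)

response = pi.describe_dimension_keys(
    ServiceType="RDS",
    Identifier="db-ABCDEFGHIJKL123",           # DbiResourceId of the CDB instance
    Metric="db.load.avg",
    GroupBy={"Group": "db.sql", "Limit": 5},   # top SQL statements
    PartitionBy={"Group": "db.pdbs"},          # assumed PDB dimension group
    StartTime=end - timedelta(hours=1),
    EndTime=end,
    PeriodInSeconds=300,
)

for key in response["Keys"]:
    print(key["Dimensions"], key.get("Total"))
```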

Optimize Amazon Aurora PostgreSQL auto scaling performance with automated cache pre-warming

When clients start running queries on new Amazon Aurora replicas, they notice longer runtimes for the first few runs of a query; this is due to the replica’s cold cache. As the database runs more queries, the cache gets populated and clients see faster runtimes. In this post, we focus on how to address the cold cache so that clients connecting through a load-balanced endpoint get a consistent experience regardless of whether the replicas were scaled automatically or manually. We also look at other caching solutions, such as Amazon ElastiCache, a fully managed caching service compatible with Valkey, Memcached, and Redis OSS, that can further improve the overall experience for latency-sensitive applications and, in some situations (such as a higher cache hit rate), lead to less frequent auto scaling events for the Aurora read replicas.
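
One common way to warm a PostgreSQL buffer cache is the pg_prewarm extension. Here is a minimal sketch of warming hot tables on a new reader; the endpoint, credentials, and table list are placeholder assumptions.

```python
# A minimal sketch: pre-warm hot tables on a new Aurora PostgreSQL replica
# with pg_prewarm. Assumes CREATE EXTENSION pg_prewarm was already run once
# on the writer, since replicas are read-only. All names are placeholders.
import psycopg2

HOT_TABLES = ["orders", "order_items"]  # assumed hot tables for this workload

conn = psycopg2.connect(
    host="my-cluster.cluster-ro-abc123.us-east-1.rds.amazonaws.com",
    dbname="appdb",
    user="app_user",
    password="example-password",
)
conn.autocommit = True

with conn.cursor() as cur:
    for table in HOT_TABLES:
        # Loads the table's blocks into shared_buffers on this replica.
        cur.execute("SELECT pg_prewarm(%s)", (table,))
        print(f"pre-warmed {table}")
conn.close()
```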

Amazon DynamoDB data models for generative AI chatbots

Amazon DynamoDB is ideal for storing chat history and metadata due to its scalability and low latency. DynamoDB can efficiently store chat history, allowing quick access to past interactions. User-specific metadata, such as preferences and session information, can be stored to personalize responses and manage active sessions, enhancing the overall chatbot experience. In this post, we explore how to design an optimal schema for chatbots, whether you’re building a small proof-of-concept application or deploying a large-scale production system.
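
As a rough illustration of the idea, here is a minimal single-table sketch: one item per message, keyed by user with a timestamped sort key so a session's messages can be fetched in order. The table name, key names, and attributes are illustrative assumptions, not the schema from the post.

```python
# A minimal single-table sketch for chat history. All names are placeholders.
from datetime import datetime, timezone
import boto3
from boto3.dynamodb.conditions import Key

table = boto3.resource("dynamodb").Table("ChatApp")  # assumed table name

def save_message(user_id: str, session_id: str, role: str, text: str) -> None:
    # One item per message; the sort key keeps a session's messages ordered.
    table.put_item(Item={
        "PK": f"USER#{user_id}",
        "SK": f"SESSION#{session_id}#MSG#{datetime.now(timezone.utc).isoformat()}",
        "role": role,   # "user" or "assistant"
        "text": text,
    })

def get_session_history(user_id: str, session_id: str) -> list:
    # Range query on the sort key returns one session's messages in order.
    resp = table.query(
        KeyConditionExpression=Key("PK").eq(f"USER#{user_id}")
        & Key("SK").begins_with(f"SESSION#{session_id}#"),
    )
    return resp["Items"]
```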

Use a DAO to govern LLM training data, Part 4: MetaMask authentication

In Part 1 of this series, we introduced the concept of using a decentralized autonomous organization (DAO) to govern the lifecycle of an AI model, focusing on the ingestion of training data. In Part 2, we created and deployed a minimalistic smart contract on the Ethereum Sepolia using Remix and MetaMask, establishing a mechanism to govern which training data can be uploaded to the knowledge base and by whom. In Part 3, we set up Amazon API Gateway and deployed AWS Lambda functions to copy data from InterPlanetary File System (IPFS) to Amazon Simple Storage Service (Amazon S3) and start a knowledge base ingestion job, creating a seamless data flow from IPFS to the knowledge base. In this post, we demonstrate how to configure MetaMask authentication, create a frontend interface, and test the solution.
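
For context on what MetaMask authentication typically involves, here is a minimal sketch of the server-side half: recovering the signer address from a signature that MetaMask produced with personal_sign over a server-issued nonce. The function and variable names are illustrative assumptions.

```python
# A minimal sketch of verifying a MetaMask personal_sign signature with
# the eth_account library. Names are illustrative placeholders.
from eth_account import Account
from eth_account.messages import encode_defunct

def verify_signature(nonce: str, signature: str, claimed_address: str) -> bool:
    # MetaMask's personal_sign prefixes the message per EIP-191;
    # encode_defunct applies the same prefix before recovery.
    message = encode_defunct(text=nonce)
    recovered = Account.recover_message(message, signature=signature)
    return recovered.lower() == claimed_address.lower()
```

If the recovered address matches the one the user claims, the backend can treat the wallet as authenticated, without any password exchange.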

Use a DAO to govern LLM training data, Part 3: From IPFS to the knowledge base

In Part 1 of this series, we introduced the concept of using a decentralized autonomous organization (DAO) to govern the lifecycle of an AI model, focusing on the ingestion of training data. In Part 2, we created and deployed a minimalistic smart contract on the Ethereum Sepolia testnet using Remix and MetaMask, establishing a mechanism to govern which training data can be uploaded to the knowledge base and by whom. In this post, we set up Amazon API Gateway and deploy AWS Lambda functions to copy data from InterPlanetary File System (IPFS) to Amazon Simple Storage Service (Amazon S3) and start a knowledge base ingestion job.
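
To sketch the shape of such a Lambda function, here is a minimal handler that copies one object from a public IPFS HTTP gateway to S3 and then starts a Bedrock knowledge base ingestion job. The bucket, gateway URL, and IDs are placeholder assumptions, not values from the post.

```python
# A minimal sketch of a Lambda handler: copy a file identified by an IPFS
# CID to S3, then kick off a knowledge base ingestion job. All names and
# IDs below are hypothetical placeholders.
import urllib.request
import boto3

s3 = boto3.client("s3")
bedrock_agent = boto3.client("bedrock-agent")

def handler(event, context):
    cid = event["cid"]  # IPFS content identifier passed in by API Gateway
    with urllib.request.urlopen(f"https://ipfs.io/ipfs/{cid}") as resp:
        s3.put_object(Bucket="my-kb-bucket", Key=f"docs/{cid}", Body=resp.read())

    job = bedrock_agent.start_ingestion_job(
        knowledgeBaseId="KB12345678",   # assumed knowledge base ID
        dataSourceId="DS12345678",      # assumed data source ID
    )
    return {"ingestionJobId": job["ingestionJob"]["ingestionJobId"]}
```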

Use a DAO to govern LLM training data, Part 2: The smart contract

In Part 1 of this series, we introduced the concept of using a decentralized autonomous organization (DAO) to govern the lifecycle of an AI model, specifically focusing on the ingestion of training data. In this post, we focus on the writing and deployment of the Ethereum smart contract that contains the outcome of the DAO decisions.
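
Once such a contract is deployed, other components can read the DAO's decisions from it. Here is a minimal sketch using web3.py; the RPC URL, contract address, ABI fragment, and function name (isApproved) are illustrative assumptions, not the contract from the post.

```python
# A minimal sketch of reading a DAO decision from a Sepolia contract with
# web3.py. Address, ABI, and function name are placeholder assumptions.
from web3 import Web3

w3 = Web3(Web3.HTTPProvider("https://rpc.sepolia.org"))  # public Sepolia RPC

ABI = [{
    "name": "isApproved",
    "type": "function",
    "stateMutability": "view",
    "inputs": [{"name": "cid", "type": "string"}],
    "outputs": [{"name": "", "type": "bool"}],
}]

contract = w3.eth.contract(
    address=Web3.to_checksum_address("0x0000000000000000000000000000000000000000"),
    abi=ABI,
)

# Ask the contract whether a given IPFS CID was approved by the DAO.
approved = contract.functions.isApproved("Qm...exampleCid").call()
print("approved:", approved)
```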

Use a DAO to govern LLM training data, Part 1: Retrieval Augmented Generation

Blockchain and generative AI are two technical fields that have received a lot of attention in recent years. There is an emerging set of use cases that can benefit from combining these two technologies. In this four-part series, we build a solution that governs the training data ingestion process of an AI model using a smart contract and serverless components. We guide you through the steps to build the solution. In this post, we review the overall architecture of the solution and set up a large language model (LLM) knowledge base.
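
To show what querying such a knowledge base looks like, here is a minimal Retrieval Augmented Generation sketch against the Bedrock runtime API. The knowledge base ID and model ARN are placeholder assumptions.

```python
# A minimal sketch of RAG over a Bedrock knowledge base. The knowledge base
# ID and model ARN below are hypothetical placeholders.
import boto3

runtime = boto3.client("bedrock-agent-runtime", region_name="us-east-1")

response = runtime.retrieve_and_generate(
    input={"text": "Which training data sources were approved by the DAO?"},
    retrieveAndGenerateConfiguration={
        "type": "KNOWLEDGE_BASE",
        "knowledgeBaseConfiguration": {
            "knowledgeBaseId": "KB12345678",
            "modelArn": "arn:aws:bedrock:us-east-1::foundation-model/"
                        "anthropic.claude-3-sonnet-20240229-v1:0",
        },
    },
)
print(response["output"]["text"])
```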