Simplify your RAG implementation using foundation models

Learn how to deploy production-ready vector storage capabilities using tried-and-tested components that are already part of most organizations' tech stacks

New capabilities without added complexity

In this article, we'll look at machine learning (ML) as a new source of system complexity and developer cognitive load, and at how those challenges can be solved with familiar, reliable components that most enterprise IT organizations likely already have in place.

We will focus on the core components of a very common ML use case: large language models (LLMs) enriched using the Retrieval-Augmented Generation (RAG) technique. Then we'll see how the infrastructure requirements behind these new use cases can be met without introducing new tools, more complexity, or additional cognitive load for developers, while still satisfying the expectations of a production-ready system running fully in the cloud.

Simplify your RAG implementation on AWS

ML and growing infrastructure complexity

Complexity has been a consistent and mounting concern for most IT organizations: complex systems are harder to operate, more costly to maintain, and more prone to failure.

Nevertheless, as new patterns for designing systems are adopted, prioritizing development velocity and leading-edge capabilities over simplicity has usually been the norm.

The cloud has helped by eliminating a good percentage of the undifferentiated heavy lifting involved in managing the many pieces of infrastructure that most modern software systems depend on. But it hasn't solved the cognitive load on developers and operations teams, who now have to work with a growing number of tools and learn a seemingly endless array of technologies.

ML is now adding yet another layer of complexity to the environments that developers and infrastructure engineers have to support, introducing the need for specialized computing capabilities, as well as a new set of requirements related to data processing and storage.

We will drill down into some of the challenges related to tooling and infrastructure for a specific type of data: vectors.

What’s the deal with vector storage?

Vectors are numerical representations of other types of data. Generating, storing, and querying them is central to applications built around the technique known as RAG.

RAG has arguably become the fastest and simplest way to provide custom responses based on proprietary data, without directly manipulating an LLM through fine-tuning, training, or other mechanisms that are dramatically more expensive and time-consuming.

Vectors are the data type used to store custom knowledge that can be queried and incorporated into the responses generated by LLMs, making those responses more accurate thanks to the additional context.

Representing data as vectors requires at least the following two components, the first of which is sketched right after this list:

1. An embedding model
2. A database capable of storing, indexing, and querying vector data
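
To make the first component concrete, here is a minimal sketch of generating an embedding with one of the embedding models available in Amazon Bedrock. I'm assuming the Titan text embedding model, the boto3 SDK, and the us-east-1 Region; any embedding model would work, and the second component, the vector database, is covered further down.

```python
import json

import boto3

# Bedrock runtime client; adjust the Region to wherever Bedrock is enabled for you.
bedrock_runtime = boto3.client("bedrock-runtime", region_name="us-east-1")

# Ask the embedding model to turn a piece of text into a vector.
response = bedrock_runtime.invoke_model(
    modelId="amazon.titan-embed-text-v1",  # assumed embedding model
    body=json.dumps({"inputText": "Redis can store and search vector data."}),
)

embedding = json.loads(response["body"].read())["embedding"]
print(len(embedding))  # Titan Text Embeddings v1 returns a 1,536-dimensional vector
```

That list of floats is all a vector is; the second component's job is to store large numbers of them and quickly find the ones closest to a query vector.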

That means most teams today are having to figure out how to add these new moving pieces to their infrastructure while satisfying the stringent demands of production-grade systems, which in turn means more surface area to support, more services to maintain, and more tooling to learn.

My personal journey with vectors and LLMs

As I started to play around with RAG and LLMs, as most of my peers were, the need for vector storage became evident, which sent me down the rabbit hole of finding the right new specialized database to add to my stack. The database needed to work effortlessly with the groundbreaking LLM concept I was building (spoiler: it wasn't as groundbreaking as I first thought).

I soon realized that with frameworks like LlamaIndex or LangChain, integrating with pretty much any vector database was trivial; usually only a handful of lines of code needed to change for me to connect to yet another of the vector databases I was trying out.
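
As an illustration of how little code that swap involves, here is a rough sketch using LangChain. The import paths and class names are assumptions based on the langchain_community package and tend to move around between versions, so treat it as a sketch rather than copy-paste code.

```python
from langchain_community.embeddings import BedrockEmbeddings
from langchain_community.vectorstores import FAISS, Redis

embeddings = BedrockEmbeddings(model_id="amazon.titan-embed-text-v1")
docs = [
    "Order 1234 shipped on Monday.",
    "Returns are accepted within 30 days of delivery.",
]

# Prototype against an in-memory store...
store = FAISS.from_texts(docs, embeddings)

# ...then point the same application at Redis by changing a couple of lines.
store = Redis.from_texts(docs, embeddings, redis_url="redis://localhost:6379")

print(store.similarity_search("What is the return policy?", k=1))
```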

And as soon as I realized that, it also dawned on me that I was spending an inordinate amount of time figuring out how to monitor, operate, query, and in general interact with every new database I was trying out, compared to the time it actually took me to write the code to get my prototype off the ground.

That's when I concluded that maybe I didn't have to look beyond my existing stack to solve the problem. And that's when I learned that Redis was actually able to work with vector data!

Old dog, new tricks

I have been a Redis user for more years than I can remember, which means I’m very familiar with its query syntax. I have a solid development environment already set up. I even have some handy Visual Studio Code plugins in place to help me quickly look at collections and query data.

Redis Query Data

But I must admit, the role of Redis pretty much anywhere I've used it throughout the years has been as a key-value store, usually working as a cache, sometimes as a queue, and in general in places where data was ephemeral and query response time was the key attribute. And, needless to say, it has served that purpose remarkably well, not just for me but for the many other enterprise architects who make the same choice in their own architectures.

It came as a surprise to me that Redis, my trusty old friend, was able to handle vector data, and to do it with the same remarkable efficiency and reliability I knew from my other use cases. And I could interact with this wholly new type of data using the same tooling I was already accustomed to, without having to learn a new syntax, install new libraries, or do anything else. I was basically ready to go.
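
To show what I mean, here is a minimal sketch of the vector side of Redis using redis-py and the same search commands (FT.CREATE and FT.SEARCH) I already knew. The index name, key prefix, and field names are made up for the example, and the 1,536 dimension matches the Titan embedding model assumed earlier.

```python
import numpy as np
import redis
from redis.commands.search.field import TextField, VectorField
from redis.commands.search.indexDefinition import IndexDefinition, IndexType
from redis.commands.search.query import Query

r = redis.Redis(host="localhost", port=6379)

# Create an index with a text field and a 1,536-dimensional vector field.
r.ft("idx:docs").create_index(
    fields=[
        TextField("content"),
        VectorField(
            "embedding",
            "HNSW",
            {"TYPE": "FLOAT32", "DIM": 1536, "DISTANCE_METRIC": "COSINE"},
        ),
    ],
    definition=IndexDefinition(prefix=["doc:"], index_type=IndexType.HASH),
)

# Stand-ins for real embeddings coming out of the embedding model.
doc_vector = np.random.rand(1536).astype(np.float32)
query_vector = np.random.rand(1536).astype(np.float32)

# Storing a document is just a hash write, with the vector as raw bytes.
r.hset("doc:1", mapping={
    "content": "Redis can store and search vector data.",
    "embedding": doc_vector.tobytes(),
})

# Querying is the familiar FT.SEARCH, now with a KNN clause.
query = (
    Query("*=>[KNN 3 @embedding $vec AS score]")
    .sort_by("score")
    .return_fields("content", "score")
    .dialect(2)
)
results = r.ft("idx:docs").search(query, query_params={"vec": query_vector.tobytes()})
print([doc.content for doc in results.docs])
```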

And then there was Amazon Bedrock

So Redis solved one of my problems: storing vectors without increasing the complexity of my tech stack. But what about generating embeddings and integrating my vectorized data as context for my LLM to use?

In my mind, that would also require new pieces of machinery: something to pull the data from Amazon Simple Storage Service (Amazon S3), where I had it stored; compute to run the embedding model; and all the glue to tie it together so my vectors would stay up to date as the data in Amazon S3 changed.

And this is where Amazon Bedrock, with its incredibly powerful agents and the concept of Knowledge Bases for Amazon Bedrock, became the solution to my complexity problem.

Knowledge for your LLM with a handful of clicks

Let’s dive into the architecture of the overall solution first:

Redis Cloud on AWS Reference Architecture

I’m still using LangChain in my containerized application running in Amazon Elastic Kubernetes Service (Amazon EKS), since I need to establish a WebSocket connection as well as handle the business logic and requests between the user and the ML model.

LangChain: Prompt input to generated output

The magic starts to happen when you look further to the right.

First off, I'm using Guardrails for Amazon Bedrock. Any production-ready LLM implementation must have a way to provide safety and security to users, which means everything from filtering harmful and undesirable topics to protecting personal and sensitive information. All these capabilities are available right off the shelf with Guardrails for Amazon Bedrock, and I was able to apply them to any foundation model I chose to use.
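
As a sketch of what that looks like from application code, this is roughly how a guardrail gets attached to a model call through the Bedrock Converse API with boto3. The guardrail identifier, version, and model ID are placeholders for resources created in the console.

```python
import boto3

bedrock_runtime = boto3.client("bedrock-runtime", region_name="us-east-1")

response = bedrock_runtime.converse(
    modelId="anthropic.claude-3-sonnet-20240229-v1:0",  # any FM available in Bedrock
    messages=[
        {"role": "user", "content": [{"text": "How do I reset my account password?"}]}
    ],
    guardrailConfig={
        "guardrailIdentifier": "gr-EXAMPLE123",  # placeholder guardrail ID
        "guardrailVersion": "1",                 # placeholder version
    },
)

print(response["output"]["message"]["content"][0]["text"])
```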

Amazon Bedrock provides instant access to many of the most relevant foundation models (FMs) available and allows incredibly quick yet powerful integrations using Agents for Amazon Bedrock and Knowledge Bases for Amazon Bedrock.

With these two capabilities I was able to fully automate the process of consuming data from Amazon S3, connecting it to my trusty Redis, and having my model of choice use that data as RAG context!

And I was able to do all this without adding a single additional component to my tech stack outside of Amazon Bedrock, which I already needed anyway to run the model itself.
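
For completeness, this is roughly what querying that setup looks like from my application code: a single call to the Bedrock agent runtime that retrieves context from the knowledge base (backed by Redis) and generates the answer. The knowledge base ID and model ARN are placeholders for the resources created in the console.

```python
import boto3

agent_runtime = boto3.client("bedrock-agent-runtime", region_name="us-east-1")

response = agent_runtime.retrieve_and_generate(
    input={"text": "What is our refund policy for enterprise customers?"},
    retrieveAndGenerateConfiguration={
        "type": "KNOWLEDGE_BASE",
        "knowledgeBaseConfiguration": {
            "knowledgeBaseId": "EXAMPLEKBID",  # placeholder knowledge base ID
            "modelArn": (
                "arn:aws:bedrock:us-east-1::foundation-model/"
                "anthropic.claude-3-sonnet-20240229-v1:0"
            ),
        },
    },
)

print(response["output"]["text"])
```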

Redis as vector storage

As I mentioned before, getting Redis up and running as vector storage was just as simple as using Redis for any other use case. It becomes even simpler with Redis Cloud, available in AWS Marketplace, since it sets up everything you need for Redis in your AWS account and VPC.

Getting started with Redis Cloud on AWS Marketplace: select cloud vendor.

After setting up your Redis Cloud account, go to Amazon Bedrock and configure a new knowledge base with Knowledge Bases for Amazon Bedrock. Select Redis Cloud in the vector database section, and then connect the knowledge base to your LLM using Agents for Amazon Bedrock.

Production-ready performance

The performance you've come to expect from Redis in more "traditional" use cases, usually one of the deciding factors in picking the tool, carries over to its vector search capabilities.

Of course, this holds for scenarios where the dataset fits in memory, which is a key characteristic that makes Redis really stand out.

The bottom line

What do you get with this setup? A fully managed set of infrastructure components for all the moving pieces of your FM-based RAG implementation, all while leveraging a tool that you are already familiar with, that is likely already present in your organization's stack, and that you are fully set up to develop with effectively.

Getting production-ready LLM deployments using the RAG technique could not be easier, and using Redis Cloud, which you can acquire in AWS Marketplace, makes the process even smoother.

Be on the lookout for our upcoming lab where I’ll show you, step by step, how to actually build this exact solution!

Why AWS Marketplace?

Try SaaS products free with your AWS account to establish your proof of concept, then pay as you go in production with AWS Billing.

Add capabilities to your tech stack using fast procurement and deployment, with flexible pricing and standardized licensing.

Consolidate and optimize costs for your cloud infrastructure and third-party software, all centrally managed with AWS.