Improve LLM RAG responses using search data

How to use all the data in your Elastic cluster as context for your RAG implementation using large language models on Amazon Bedrock

Foundational large language models (LLMs) are trained on huge datasets. To put this into context, GPT-4o is reported to have been trained on roughly 13 trillion tokens. But there's a caveat to all that data: it's public, and it's only as current as the model's training cutoff. And, most importantly, it does not include any data that is private or unique to your organization.

And the thing is, your organization already holds huge amounts of extremely valuable data that would make the inferences of foundational LLMs much more relevant to your users. We’re going to call this your organizational body of knowledge.

In this article, we’ll talk about using an organization’s existing body of knowledge as added context for foundational LLMs through retrieval-augmented generation (RAG), and how Elastic makes that a whole lot simpler by removing the need to move data around.

Let’s dig in!

Stop moving data around: Simplify your architecture with Elastic Cloud

Working with code and data

For those coming from a developer background, code is pretty much the primary artifact that one is used to working with. Yes, we need data to validate that our read and write operations are working as expected. And data is of course involved in making sure our business logic works as we anticipate.

Data, however, serves a very different purpose in the workflows of those working with machine learning.

Data and code are inseparably linked to the outcome the developer is looking to achieve. Models require access to data, whether for training or as additional context in RAG implementations built on foundational models.

If we look at the RAG scenario specifically, this has historically meant getting data from different sources, transforming it, and then generating vector representations of it. Those vectors then had to be stored in purpose-specific data repositories that support that very specific data type and offer the query capabilities the RAG implementation needs, so it can find the data relevant to the response the model is looking to generate.

Here’s a sample architecture of the traditional flow of storing vectors from multiple data sources:

Multiple data sources
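
To make the traditional flow concrete, here is a minimal sketch of it in Python. The embed() function and the in-memory vector store are hypothetical stand-ins for a real embedding model and a purpose-built vector repository such as Amazon MemoryDB; the point is only to show the fetch, transform, embed, store, and query steps.

```python
import math

# Hypothetical embedding function: a real pipeline would call an embedding model
# instead of this toy hash-based stub.
def embed(text: str, dims: int = 8) -> list[float]:
    vec = [0.0] * dims
    for token in text.lower().split():
        vec[hash(token) % dims] += 1.0
    norm = math.sqrt(sum(v * v for v in vec)) or 1.0
    return [v / norm for v in vec]

def cosine(a: list[float], b: list[float]) -> float:
    return sum(x * y for x, y in zip(a, b))

# Stand-in for a purpose-specific vector repository.
vector_store: dict[str, dict] = {}

# 1. Fetch and transform documents from different sources (hard-coded here).
documents = {
    "kb-001": "Refunds are processed within five business days.",
    "kb-002": "Enterprise customers get a dedicated support channel.",
}

# 2. Generate vector representations and store them.
for doc_id, text in documents.items():
    vector_store[doc_id] = {"text": text, "vector": embed(text)}

# 3. At query time, retrieve the passage most relevant to the user's question
#    so it can be passed to the LLM as context.
query_vector = embed("how long do refunds take?")
best = max(vector_store.values(), key=lambda d: cosine(d["vector"], query_vector))
print("Context passed to the LLM:", best["text"])
```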

Most organizations already hold lots of centralized knowledge

Most organizations already have the processes in place to consolidate all the relevant knowledge that users are looking to access. And this knowledge is usually made available through search-engine capabilities and knowledge bases that users have access to using traditional query methods—the good old “search” feature that most applications make available as a standard capability today.

And this knowledge is already flowing in from different sources: documents stored in JSON format, data in relational databases, and files in object storage that require more intricate parsing before they can be integrated into a queryable, central repository.

All of these sources, once fetched and transformed, are stored and indexed in a system that provides specialized capabilities for fast, human-friendly searches over large amounts of indexed data: a search engine.

Centralized knowledge
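
To make the centralized-knowledge side just as concrete, here is a rough sketch of indexing a document into Elasticsearch and running a traditional keyword query against it with the official Python client. The endpoint, API key, index name, and field names are placeholders for illustration.

```python
from elasticsearch import Elasticsearch

# Placeholder endpoint and credentials for an Elastic Cloud deployment.
es = Elasticsearch(
    "https://my-deployment.es.us-east-1.aws.found.io:443",
    api_key="YOUR_API_KEY",
)

# Index a document produced by one of the ingestion pipelines.
es.index(
    index="org-knowledge",
    id="kb-001",
    document={
        "title": "Refund policy",
        "content": "Refunds are processed within five business days.",
        "source": "s3://policies/refunds.pdf",
    },
)

# The good old "search" feature: a traditional full-text match query.
response = es.search(
    index="org-knowledge",
    query={"match": {"content": "refund processing time"}},
)
for hit in response["hits"]["hits"]:
    print(hit["_score"], hit["_source"]["title"])
```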

For those paying close attention, I’m sure you’ve already noticed that there are a lot of similarities between the two diagrams above:

  • They both get data from application-specific sources.
  • They both use something (Amazon EMR, in the examples above) to fetch, process, and store that data somewhere.
  • They’re both storing the processed data in some repository (Amazon MemoryDB in the first example and Elastic Cloud in the second one).
  • They both make the stored data available for consumption to end users.

The fundamental difference between the two scenarios is that the processed data ends up in a different format and is made available to users through a different mechanism: in the first example, most likely through a chatbot with context provided by the data in Amazon MemoryDB; in the second, via an API.

Putting both things together

Let’s put these two diagrams together without any optimization and see how they look:

Without optimization

We have the same data sources, but now we’re running two different sets of pipelines with Amazon EMR: one pushing data into Elastic Cloud for API-based querying, and another pushing data for vector generation and eventual access through RAG, exemplified by LangChain in the diagram above. The end user ends up querying two separate data stores depending on how they choose to search, whether through the chatbot or the traditional search functionality.

Now, let’s make this simpler!

And now we get to the revolutionary concept of this article, one that we can accomplish thanks to Elastic Cloud’s Elastic Learned Sparse Encoder (ELSER).

ELSER is a retrieval model that enables you to perform semantic search on data stored in your Elasticsearch cluster. Not only does it simplify your architecture; given the weighted, associative way the model matches queries to documents, it also provides better context for the responses generated by the LLM.
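
As a minimal sketch of what an ELSER query looks like, the snippet below assumes the documents were already ingested through an inference pipeline that wrote the model’s output into a content_embedding field of type sparse_vector, and that the ELSER v2 model (.elser_model_2) is deployed in the cluster. Index, field, and model names are assumptions, and the exact query syntax has evolved across Elastic Stack versions, so check the ELSER documentation for yours.

```python
from elasticsearch import Elasticsearch

es = Elasticsearch(
    "https://my-deployment.es.us-east-1.aws.found.io:443",
    api_key="YOUR_API_KEY",
)

# Semantic search with ELSER: the query text is expanded by the model into
# weighted tokens and matched against the sparse vectors stored at ingest time.
response = es.search(
    index="org-knowledge",
    query={
        "text_expansion": {
            "content_embedding": {
                "model_id": ".elser_model_2",
                "model_text": "how long does a customer wait for a refund?",
            }
        }
    },
)
for hit in response["hits"]["hits"]:
    print(hit["_score"], hit["_source"]["content"])
```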

Now let’s rework our architecture, removing all unnecessary components and optimizing to use Elastic as the single place to store our organizational knowledge, regardless of the mode of consumption:

Elastic Learned Sparse Encoder (ELSER)

Easier, right? Let’s look at some of the changes that make this solution using Elastic Cloud’s ELSER capabilities so much more efficient:

  1. We have removed Amazon MemoryDB from the architecture, since Elastic serves all our requirements.
  2. We can use the same pipelines that already keep Elastic up to date with data for API consumption to also split our data into the passages that make ELSER work much more efficiently.
  3. LangChain can actually index and store vectors directly into Elastic as well, completely eliminating the need for additional data repositories (see the sketch after this list).
  4. Amazon Bedrock dramatically reduces the effort in running a foundational LLM by providing fully serverless capabilities and off-the-shelf access to the most relevant large language models available today.
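
Here is a minimal sketch of points 2 through 4 put together: splitting documents into passages, letting LangChain index them straight into Elasticsearch with an ELSER strategy, and grounding a Bedrock-hosted model in the retrieved passages. It assumes the langchain-text-splitters, langchain-elasticsearch, and langchain-aws packages, an ELSER v2 deployment, and placeholder endpoints, credentials, and model IDs; exact class and strategy names vary across LangChain versions, so treat this as a starting point rather than a definitive implementation.

```python
from langchain_text_splitters import RecursiveCharacterTextSplitter
from langchain_elasticsearch import ElasticsearchStore, SparseVectorStrategy
from langchain_aws import ChatBedrock

# 2. Split source documents into passages so ELSER can work with them efficiently.
splitter = RecursiveCharacterTextSplitter(chunk_size=500, chunk_overlap=50)
passages = splitter.split_text(open("refund_policy.txt").read())

# 3. LangChain indexes the passages directly into Elastic Cloud, with ELSER
#    generating the sparse vectors inside the cluster (no separate vector store).
store = ElasticsearchStore(
    es_url="https://my-deployment.es.us-east-1.aws.found.io:443",
    es_api_key="YOUR_API_KEY",
    index_name="org-knowledge-rag",
    strategy=SparseVectorStrategy(model_id=".elser_model_2"),
)
store.add_texts(passages)

# 4. A foundational model served by Amazon Bedrock, with no infrastructure to run.
llm = ChatBedrock(
    model_id="anthropic.claude-3-sonnet-20240229-v1:0",
    region_name="us-east-1",
)

# RAG: retrieve the most relevant passages, then ground the model's answer in them.
question = "How long do refunds take?"
context = "\n\n".join(doc.page_content for doc in store.as_retriever().invoke(question))
answer = llm.invoke(
    f"Answer using only the context below.\n\nContext:\n{context}\n\nQuestion: {question}"
)
print(answer.content)
```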

Let’s look at some benefits

This solution has many benefits that stem from the reduction of moving pieces and consolidation of data.

Since we’re using the same pipelines and a single data repository, we have a reasonable assurance that the data available through traditional API queries and the data included in LLM responses represent the same source of truth at the same point in time.

We’re also reducing the cost and effort of data storage and transport, and, with fewer moving components to modify and update, making it easier for data engineers to keep data consistent when source schemas change.

Elastic Cloud’s ELSER capabilities provide our solution with improved accuracy and relevance in responses generated by the LLM.

And, last but not least, by switching to services like Amazon Bedrock and using Elastic Cloud we’re close to eliminating all effort related to operating and scaling the underlying infrastructure while being able to access leading-edge capabilities.

This is awesome! What do I do next?

Look for Elastic Cloud in AWS Marketplace so you can start trying this solution out using the available free trial. Getting it in AWS Marketplace will make it very easy to configure Elastic to work with your AWS account.

And, finally, be on the lookout for a hands-on lab we’ll be releasing soon, where you’ll be able to get step-by-step guidance on how to actually build this solution on your own AWS environment.

See you all soon!

