Amazon MSK now supports vector embedding generation using Amazon Bedrock

Posted on: Nov 6, 2024

Amazon MSK (Managed Streaming for Apache Kafka) now supports a new Amazon Managed Service for Apache Flink blueprint that generates vector embeddings using Amazon Bedrock, making it easier to build real-time AI applications powered by up-to-date, contextual data. The blueprint simplifies incorporating the latest data from your Amazon MSK streaming pipelines into your generative AI models, eliminating the need to write custom code to integrate real-time data streams, vector databases, and large language models.

With just a few clicks, customers can configure the blueprint to continuously generate vector embeddings from their Amazon MSK data streams using Bedrock's embedding models, then index those embeddings in Amazon OpenSearch Service. This allows customers to combine the context from real-time data with Bedrock's large language models to generate accurate, up-to-date AI responses. Customers can also choose to improve the efficiency of data retrieval using built-in support for data chunking techniques from LangChain, an open-source library, which helps produce high-quality inputs for model ingestion. The blueprint manages the data integration and processing between MSK, the chosen embedding model, and the OpenSearch vector store, allowing customers to focus on building their AI applications rather than managing the underlying integration.
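To make the flow above concrete, here is a minimal Python sketch of the three stages the blueprint automates: chunk a record, embed each chunk, and shape the result as a document for a vector index. It is not the blueprint's implementation. The function names, the document field names, and the character-window chunker (a simple stand-in for a LangChain text splitter) are all assumptions made for illustration. The Bedrock call assumes the Titan Text Embeddings V2 model and is imported lazily, so the rest of the sketch runs without AWS credentials.

```python
import json
from typing import Callable, Dict, Iterator, List


def chunk_text(text: str, max_chars: int = 200, overlap: int = 20) -> List[str]:
    """Split text into character windows that overlap slightly.

    Illustrative stand-in for a LangChain splitter, not the blueprint's chunker.
    """
    if max_chars <= 0 or not 0 <= overlap < max_chars:
        raise ValueError("need max_chars > 0 and 0 <= overlap < max_chars")
    step = max_chars - overlap
    chunks: List[str] = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + max_chars])
        if start + max_chars >= len(text):
            break  # the last window already reached the end of the text
        start += step
    return chunks


def bedrock_embedder(
    model_id: str = "amazon.titan-embed-text-v2:0",  # assumed model choice
    region: str = "us-east-1",                       # assumed Region
) -> Callable[[str], List[float]]:
    """Return a function that embeds one string with Amazon Bedrock."""
    import boto3  # imported here so the sketch runs without boto3 installed

    client = boto3.client("bedrock-runtime", region_name=region)

    def embed(text: str) -> List[float]:
        response = client.invoke_model(
            modelId=model_id,
            body=json.dumps({"inputText": text}),
        )
        return json.loads(response["body"].read())["embedding"]

    return embed


def to_documents(
    record_id: str,
    text: str,
    embed: Callable[[str], List[float]],
    index: str = "msk-embeddings",  # hypothetical index name
) -> Iterator[Dict[str, object]]:
    """Yield one index-ready document per chunk (field names are assumptions)."""
    for n, chunk in enumerate(chunk_text(text)):
        yield {
            "_index": index,
            "_id": f"{record_id}-{n}",
            "text": chunk,
            "embedding": embed(chunk),
        }
```

In a real pipeline, `to_documents` would be called for each record consumed from an MSK topic with `embed=bedrock_embedder()`, and the resulting documents would be written to an OpenSearch index that has a vector field mapped for `embedding`. Passing the embedding function as a parameter keeps the chunking and document-shaping logic testable offline with a stub.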

The real-time vector embedding blueprint is generally available in the US East (N. Virginia), US East (Ohio), US West (Oregon), Europe (Paris), Europe (London), Europe (Ireland), and South America (Sao Paulo) AWS Regions. For the list of additional Regions that will be supported over the next few weeks, visit the Amazon MSK documentation. To learn more about how to use the blueprint to generate real-time vector embeddings from your Amazon MSK data, visit the AWS blog.