Artificial Intelligence
Category: Generative AI
How Rufus doubled their inference speed and handled Prime Day traffic with AWS AI chips and parallel decoding
Rufus, an AI-powered shopping assistant, relies on many components to deliver its customer experience, including a foundation LLM for response generation and a query planner (QP) model for query classification and retrieval enhancement. This post focuses on how the QP model used draft-centric speculative decoding (SD), also called parallel decoding, with AWS AI chips to meet the demands of Prime Day. By combining parallel decoding with AWS Trainium and Inferentia chips, Rufus achieved response times twice as fast, a 50% reduction in inference costs, and seamless scalability during peak traffic.
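To make the draft-and-verify idea behind speculative decoding concrete, here is a minimal greedy sketch. It is not the Rufus implementation: `speculative_decode`, `toy_target`, and `toy_draft` are hypothetical names, and the "models" are plain functions that map a token list to the next token. A cheap draft proposes up to `k` tokens, and the target model checks them. In a real system the check is a single batched forward pass, which is where the speedup comes from.

```python
from typing import Callable, List

Token = int
NextToken = Callable[[List[Token]], Token]  # hypothetical "model": context -> next token


def speculative_decode(target: NextToken, draft: NextToken,
                       prompt: List[Token], max_new: int, k: int = 4) -> List[Token]:
    """Greedy draft-and-verify decoding; output matches greedy decoding with `target`."""
    out = list(prompt)
    limit = len(prompt) + max_new
    while len(out) < limit:
        # 1. The draft proposes up to k tokens autoregressively (cheap).
        proposal: List[Token] = []
        for _ in range(min(k, limit - len(out))):
            proposal.append(draft(out + proposal))
        # 2. The target verifies the proposal position by position.
        #    (Done sequentially here for clarity; a real system batches this.)
        accepted: List[Token] = []
        for tok in proposal:
            expected = target(out + accepted)
            if tok == expected:
                accepted.append(tok)       # draft guessed right: keep it
            else:
                accepted.append(expected)  # first mismatch: take the target's token, stop
                break
        out.extend(accepted)
    return out


def toy_target(ctx: List[Token]) -> Token:
    # Assumed toy target model: counts upward modulo 10.
    return (ctx[-1] + 1) % 10


def toy_draft(ctx: List[Token]) -> Token:
    # Assumed toy draft model: agrees with the target except at every third position.
    return (ctx[-1] + 1) % 10 if len(ctx) % 3 else 0


result = speculative_decode(toy_target, toy_draft, [0], max_new=8)
print(result)
```

Because every emitted token is either confirmed or replaced by the target, the result is identical to decoding with the target alone; the draft only changes how many target calls can be batched together.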
New Amazon Bedrock Data Automation capabilities streamline video and audio analysis
Amazon Bedrock Data Automation helps organizations streamline development and boost efficiency through customizable, multimodal analytics. It eliminates the heavy lifting of unstructured content processing at scale, whether for video or audio. The new capabilities make it faster to extract tailored, generative AI-powered insights like scene summaries, key topics, and customer intents from video and audio. This unlocks the value of unstructured content for use cases such as improving sales productivity and enhancing customer experience.
GuardianGamer scales family-safe cloud gaming with AWS
In this post, we share how GuardianGamer uses AWS services including Amazon Nova and Amazon Bedrock to deliver a scalable and efficient supervision platform. The team uses Amazon Nova for intelligent narrative generation to provide parents with meaningful insights into their children’s gaming activities and social interactions, while maintaining a non-intrusive approach to monitoring.
Optimize query responses with user feedback using Amazon Bedrock embedding and few-shot prompting
This post demonstrates how Amazon Bedrock, combined with a user feedback dataset and few-shot prompting, can refine responses for higher user satisfaction. By using Amazon Titan Text Embeddings v2, we demonstrate a statistically significant improvement in response quality, making this approach a valuable tool for applications seeking accurate and personalized responses.
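As a rough illustration of how embeddings and few-shot prompting can fit together, the sketch below picks the feedback examples most similar to a new query and places them in the prompt. It assumes the embeddings are already computed (in the post they would come from Amazon Titan Text Embeddings v2; here they are hand-written 2-D vectors), and `cosine`, `select_examples`, and `build_prompt` are hypothetical helper names, not part of any AWS SDK.

```python
import math
from typing import List, Sequence, Tuple

Example = Tuple[Sequence[float], str, str]  # (embedding, question, well-rated answer)


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity; returns 0.0 if either vector has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0


def select_examples(query_vec: Sequence[float], examples: List[Example], k: int = 2) -> List[Example]:
    """Return the k feedback examples whose embeddings are closest to the query."""
    return sorted(examples, key=lambda e: cosine(query_vec, e[0]), reverse=True)[:k]


def build_prompt(query: str, shots: List[Example]) -> str:
    """Assemble a few-shot prompt: selected Q/A pairs first, then the new question."""
    parts = [f"Q: {q}\nA: {a}" for _, q, a in shots]
    parts.append(f"Q: {query}\nA:")
    return "\n\n".join(parts)


# Toy feedback dataset with made-up vectors standing in for real embeddings.
feedback = [
    ([1.0, 0.0], "How do I reset my password?", "Use the 'Forgot password' link."),
    ([0.0, 1.0], "What is the refund policy?", "Refunds are issued within 30 days."),
    ([0.9, 0.1], "How do I change my email?", "Edit it under Account settings."),
]

shots = select_examples([1.0, 0.0], feedback, k=2)
prompt = build_prompt("How do I update my password?", shots)
print(prompt)
```

The resulting prompt string would then be sent to the text model of your choice; that call is left out here to keep the sketch self-contained.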
Integrate Amazon Bedrock Agents with Slack
In this post, we present a solution to incorporate Amazon Bedrock Agents in your Slack workspace. We guide you through configuring a Slack workspace, deploying integration components in Amazon Web Services, and using this solution.
Secure distributed logging in scalable multi-account deployments using Amazon Bedrock and LangChain
In this post, we present a solution for securing distributed logging in multi-account deployments using Amazon Bedrock and LangChain.
HERE Technologies boosts developer productivity with new generative AI-powered coding assistant
HERE collaborated with the AWS Generative AI Innovation Center (GenAIIC). Our joint mission was to create an intelligent AI coding assistant that could provide explanations and executable code solutions in response to users’ natural language queries. The requirement was to build a scalable system that could translate natural language questions into HTML code with embedded JavaScript, ready for immediate rendering as an interactive map that users can see on screen.
Detect hallucinations for RAG-based systems
This post walks you through how to create a basic hallucination detection system for RAG-based applications. We also weigh the pros and cons of different methods in terms of accuracy, precision, recall, and cost.
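For readers who want to compare detection methods on these metrics, here is a small sketch of how accuracy, precision, and recall could be computed for a hallucination detector. The function name `score_detector` and the labels are assumptions made up for illustration; `True` means "flagged as hallucinated" in the predictions and "actually hallucinated" in the ground truth.

```python
from typing import Dict, Sequence


def score_detector(predicted: Sequence[bool], actual: Sequence[bool]) -> Dict[str, float]:
    """Compute accuracy, precision, and recall for binary hallucination flags."""
    if len(predicted) != len(actual):
        raise ValueError("predicted and actual must have the same length")
    pairs = list(zip(predicted, actual))
    tp = sum(1 for p, a in pairs if p and a)          # hallucinations correctly flagged
    fp = sum(1 for p, a in pairs if p and not a)      # faithful answers wrongly flagged
    fn = sum(1 for p, a in pairs if not p and a)      # hallucinations that were missed
    tn = sum(1 for p, a in pairs if not p and not a)  # faithful answers correctly passed
    total = len(pairs)
    return {
        "accuracy": (tp + tn) / total if total else 0.0,
        "precision": tp / (tp + fp) if (tp + fp) else 0.0,
        "recall": tp / (tp + fn) if (tp + fn) else 0.0,
    }


# Made-up labels for four responses.
scores = score_detector(
    predicted=[True, True, False, False],
    actual=[True, False, True, False],
)
print(scores)
```

High precision means few faithful answers are wrongly flagged, and high recall means few hallucinations slip through; which one matters more depends on the application.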
How Apoidea Group enhances visual information extraction from banking documents with multimodal models using LLaMA-Factory on Amazon SageMaker HyperPod
Building on this foundation of specialized information extraction solutions and using the capabilities of SageMaker HyperPod, we collaborate with Apoidea Group to explore the use of large vision language models (LVLMs) to further improve table structure recognition performance on banking and financial documents. In this post, we present our work and step-by-step code for fine-tuning the Qwen2-VL-7B-Instruct model using LLaMA-Factory on SageMaker HyperPod.
How Qualtrics built Socrates: An AI platform powered by Amazon SageMaker and Amazon Bedrock
In this post, we share how Qualtrics built an AI platform powered by Amazon SageMaker and Amazon Bedrock.