Artificial Intelligence

Category: Business Intelligence

Build a proactive AI cost management system for Amazon Bedrock – Part 2

In this post, we explore advanced cost monitoring strategies for Amazon Bedrock deployments, introducing granular custom tagging approaches for precise cost allocation and comprehensive reporting mechanisms that build upon the proactive cost management foundation established in Part 1. The solution demonstrates how to implement invocation-level tagging, application inference profiles, and integration with AWS Cost Explorer to create a complete 360-degree view of generative AI usage and expenses.

Build a proactive AI cost management system for Amazon Bedrock – Part 1

In this post, we introduce a comprehensive solution for proactively managing Amazon Bedrock inference costs through a cost sentry mechanism designed to establish and enforce token usage limits, providing organizations with a robust framework for controlling generative AI expenses. The solution uses serverless workflows and native Amazon Bedrock integration to deliver a predictable, cost-effective approach that aligns with organizational financial constraints while preventing runaway costs through leading indicators and real-time budget enforcement.

AWS RAG API architecture diagram illustrating end-to-end query processing with knowledge base integration and LLM response generation

Demystifying Amazon Bedrock Pricing for a Chatbot Assistant

In this post, we’ll look at Amazon Bedrock pricing through the lens of a practical, real-world example: building a customer service chatbot. We’ll break down the essential cost components, walk through capacity planning for a mid-sized call center implementation, and provide detailed pricing calculations across different foundation models.

Cost tracking multi-tenant model inference on Amazon Bedrock

In this post, we demonstrate how to track and analyze multi-tenant model inference costs on Amazon Bedrock using the Converse API’s requestMetadata parameter. The solution includes an ETL pipeline using AWS Glue and Amazon QuickSight dashboards to visualize usage patterns, token consumption, and cost allocation across different tenants and departments.

A screenshot of the AI assistant

Democratize data for timely decisions with text-to-SQL at Parcel Perform

The business team in Parcel Perform often needs access to data to answer questions related to merchants’ parcel deliveries, such as “Did we see a spike in delivery delays last week? If so, in which transit facilities were this observed, and what was the primary cause of the issue?” Previously, the data team had to manually form the query and run it to fetch the data. With the new generative AI-powered text-to-SQL capability in Parcel Perform, the business team can self-serve their data needs by using an AI assistant interface. In this post, we discuss how Parcel Perform incorporated generative AI, data storage, and data access through AWS services to make timely decisions.

Build a just-in-time knowledge base with Amazon Bedrock

Traditional Retrieval Augmented Generation (RAG) systems consume valuable resources by ingesting and maintaining embeddings for documents that might never be queried, resulting in unnecessary storage costs and reduced system efficiency. This post presents a just-in-time knowledge base solution that reduces unused consumption through intelligent document processing. The solution processes documents only when needed and automatically removes unused resources, so organizations can scale their document repositories without proportionally increasing infrastructure costs.

Choosing the right approach for generative AI-powered structured data retrieval

In this post, we explore five different patterns for implementing LLM-powered structured data query capabilities in AWS, including direct conversational interfaces, BI tool enhancements, and custom text-to-SQL solutions.

Mental model for choosing Amazon Bedrock options for cost optimization

Effective cost optimization strategies for Amazon Bedrock

With the increasing adoption of Amazon Bedrock, optimizing costs is a must to help keep the expenses associated with deploying and running generative AI applications manageable and aligned with your organization’s budget. In this post, you’ll learn about strategic cost optimization techniques while using Amazon Bedrock.

Automating complex document processing: How Onity Group built an intelligent solution using Amazon Bedrock

In this post, we explore how Onity Group, a financial services company specializing in mortgage servicing and origination, transformed their document processing capabilities using Amazon Bedrock and other AWS services. The solution helped Onity achieve a 50% reduction in document extraction costs while improving overall accuracy by 20% compared to their previous OCR and AI/ML solution.

Image of an AWS Architecture diagram

Build an intelligent community agent to revolutionize IT support with Amazon Q Business

In this post, we demonstrate how your organization can reduce the end-to-end burden of resolving regular challenges experienced by your IT support teams—from understanding errors and reviewing diagnoses, remediation steps, and relevant documentation, to opening external support tickets using common third-party services such as Jira.