Amazon Bedrock expands support for Service Quotas

Posted on: May 27, 2026

Amazon Bedrock is a fully managed service that provides secure, enterprise-grade access to high-performing foundation models from leading AI companies, enabling you to build and scale generative AI applications. Amazon Bedrock customers can now view inference quotas for the bedrock-mantle endpoint through AWS Service Quotas. This gives customers a familiar, consistent way to track limits for this endpoint, the same way they already do for the bedrock-runtime endpoint and other AWS services, and gives them clear visibility into the limits that apply to their workloads.

The bedrock-mantle endpoint supports the OpenAI Responses API, the OpenAI Chat Completions API, and the Anthropic Messages API, letting customers run existing OpenAI-based or Anthropic-based applications on Amazon Bedrock with minimal code changes. AWS Service Quotas now exposes per-model input-tokens-per-minute and output-tokens-per-minute quotas for supported models on the endpoint.

With this launch, customers can see the limits that apply to them on the bedrock-mantle endpoint and can proactively plan for production scale. To get started, open the AWS Service Quotas console, choose Amazon Bedrock, and search for "Bedrock Mantle" to view your current quotas. To request an increase to any of these quotas, follow the standard Amazon Bedrock limit increase process. Service Quotas support for the bedrock-mantle endpoint is available in all AWS Regions where the endpoint is offered: US East (N. Virginia, Ohio), US West (Oregon), Asia Pacific (Mumbai, Tokyo, Sydney, Jakarta), Europe (Frankfurt, Ireland, London, Milan, Stockholm), and South America (São Paulo). To learn more, see Quotas for Amazon Bedrock.
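
The console lookup described above can likely also be done programmatically. The following is a minimal sketch, not an official sample: it assumes the quotas are listed under the Service Quotas service code "bedrock" and that their names contain the text "Bedrock Mantle" (inferred from the console search term). The helper names filter_mantle_quotas and iter_bedrock_quotas are hypothetical. It uses the boto3 Service Quotas client and its list_service_quotas paginator.

```python
from typing import Dict, Iterable, Iterator, List


def filter_mantle_quotas(quotas: Iterable[Dict], needle: str = "Bedrock Mantle") -> List[Dict]:
    """Keep only quotas whose name contains `needle` (case-insensitive).

    Assumption: bedrock-mantle quotas carry "Bedrock Mantle" in their
    QuotaName, mirroring the console search term.
    """
    needle_lower = needle.lower()
    return [
        {
            "name": q.get("QuotaName", ""),
            "code": q.get("QuotaCode"),
            "value": q.get("Value"),
        }
        for q in quotas
        if needle_lower in q.get("QuotaName", "").lower()
    ]


def iter_bedrock_quotas(region: str) -> Iterator[Dict]:
    """Yield every Service Quotas entry for Amazon Bedrock in one Region."""
    import boto3  # imported lazily so the filter above works without AWS access

    client = boto3.client("service-quotas", region_name=region)
    paginator = client.get_paginator("list_service_quotas")
    # Assumption: "bedrock" is the service code that holds these quotas.
    for page in paginator.paginate(ServiceCode="bedrock"):
        yield from page["Quotas"]


if __name__ == "__main__":
    # Requires AWS credentials with servicequotas:ListServiceQuotas permission.
    for quota in filter_mantle_quotas(iter_bedrock_quotas("us-east-1")):
        print(f'{quota["code"]}  {quota["name"]}: {quota["value"]}')
```

Keeping the filter separate from the AWS call means it can be checked against sample data, and the match string can be adjusted if the actual quota names differ from the assumption.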