Amazon Bedrock now supports Responses API from OpenAI

Posted on: Dec 4, 2025

Amazon Bedrock now supports Responses API on new OpenAI API-compatible service endpoints. Responses API enables developers to achieve asynchronous inference for long-running inference workloads, simplifies tool use integration for agentic workflows, and also supports stateful conversation management. Instead of requiring developers to pass the entire conversation history with each request, Responses API enables them to automatically rebuild context without manual history management. These new service endpoints support both streaming and non-streaming modes, enable reasoning effort support within Chat Completions API, and require only a base URL change for developers to integrate within existing codebases with OpenAI SDK compatibility.

Chat Completions with reasoning effort support is available for all Amazon Bedrock models that are powered by Mantle, a new distributed inference engine for large-scale machine learning model serving on Amazon Bedrock. Mantle simplifies and expedites onboarding of new models onto Amazon Bedrock, provides highly performant and reliable serverless inference with sophisticated quality of service controls, unlocks higher default customer quotas with automated capacity management and unified pools, and provides out-of-the-box compatibility with OpenAI API specifications. Responses API support is available today starting with OpenAI's GPT OSS 20B/120B models, with support for other models coming soon.

To get started, visit the service documentation here