Amazon Bedrock offers select FMs for batch inference at 50% of on-demand inference price
Last year, we introduced support for batch inference in preview, allowing you to process prompts in batch to get responses for model evaluation, experimentation, and offline processing. Beginning today, Amazon Bedrock supports batch inference in general availability in all supported AWS regions for supported models. Use batch inference to run multiple inference requests asynchronously, and improve the performance of model inference on large datasets. Amazon Bedrock offers select foundation models (FMs) from leading AI providers like Anthropic, Meta, Mistral AI, and Amazon for batch inference at 50% of on-demand inference pricing. Completion time of batch inference depends on various factors like the size of the job, but you can expect completion timeframe of a typical job within 24 hours. You can learn more in our batch inference documentation and you can also reference our API reference documentation.
Amazon Bedrock is a fully managed service that offers a choice of high-performing foundation models (FMs) from leading AI companies like AI21 Labs, Anthropic, Cohere, Meta, Mistral AI, Stability AI, as well as Amazon via a single API. Amazon Bedrock also provides a broad set of capabilities customers need to build generative AI applications with security, privacy, and responsible AI built in. These capabilities help you build tailored applications for multiple use cases across different industries, helping organizations unlock sustained growth from generative AI while ensuring customer trust and data governance.
To more information about Amazon Bedrock, visit the Amazon Bedrock page and see the Amazon Bedrock documentation for more details.