Skip to main content
2025

Composio Increases AI Coding Agent Accuracy by 50% with Multi-Model Testing on Amazon Bedrock

Learn how Composio streamlined AI model experimentation with AWS, enhancing coding agent performance to support AI-driven automation at scale.

Benefits

improvement in coding agent accuracy

reduction in AI model testing time

increase in token throughput per minute, from 5M to 10M

Overview

Composio provides a communication layer for AI agents and large language models (LLMs), helping developers streamline AI-powered automation. The company sought to experiment with a variety of foundation models; however, managing separate model providers quickly became complex, leading to a drain on engineering effort. Composio decided to centralize its AI model testing on Amazon Web Services (AWS), eliminating the need for separate integrations and enabling seamless parallel testing. The move improved model accuracy by 50 percent, reduced AI model experimentation time by two weeks, and doubled token throughput from five million to ten million per minute. These advancements significantly helped Composio achieve the No. 1 ranking on SWE-Bench within a short period of time.

Missing alt text value

About Composio

Composio is building the future of agentic actions by empowering AI agents to interact with hundreds of applications and tools, enabling autonomous actions that streamline workflows. Founded in the United States in 2023, the company provides a communication layer for AI agents and LLM apps, helping enterprises deploy and scale AI-driven solutions efficiently.

Opportunity | Streamlining AI Model Integration and Testing for Automation

Founded in India in 2023, Composio provides an AI integration platform that connects LLMs with external tools and services. The platform helps developers and enterprises develop AI agents that integrate seamlessly across different software environments, making AI-powered automation more efficient.

Initially, Composio managed separate integrations for LLMs across different providers, requiring developers to switch between multiple SDKs and APIs. This fragmented approach added complexity, making experimentation less efficient, while the lack of a centralized testing framework further restricted iteration speed. Meanwhile, an enterprise customer required a more efficient way to evaluate multiple AI models for automation workflows. “We and our clients needed a way to experiment rapidly with multiple AI models without the overhead of managing separate integrations,” recalls Soham Ganatra, co-founder and CEO at Composio.

Recognizing the need for a more efficient, unified approach to AI model evaluation, the company sought a way to streamline experimentation and support multiple LLMs without adding operational complexity.

Solution | Optimizing AI Model Experimentation with Amazon Bedrock

After evaluating various AI model deployment and testing solutions, Composio chose AWS for its scalability, security, and ability to centralize AI model testing. By adopting Amazon Bedrock, a generative AI solution, Composio empowered its developers to test multiple AI models in parallel and optimize performance more efficiently. “With Amazon Bedrock, we can experiment faster and optimize our AI models with minimal operational overhead,” explains Ganatra. The company also utilized Amazon Bedrock to determine that Anthropic Claude delivered the highest accuracy for its coding agent.

Beyond providing a scalable testing platform, AWS helped facilitate smooth deployment at scale—while ensuring security and compliance requirements were met. AWS’s SOC Type II compliance gave Composio confidence in handling data securely as it optimized AI-driven workflows. Initially, low throughput limits caused request throttling that prevented running large-scale experiments. AWS engineers worked closely with the Composio team to increase token throughput from five million to ten million per minute, enabling seamless parallel testing. “AWS worked closely with us to resolve throughput limitations, which allowed us to scale AI model testing without disruptions,” Ganatra adds.

Outcome | Increasing AI-Powered Coding Agent Accuracy by 50% and Accelerating Developer Adoption

Shifting AI model experimentation to Amazon Bedrock eliminated operational bottlenecks and cut testing time by two weeks, helping Composio’s engineering team iterate and optimize solutions faster. With model testing and comparison centralized, workflows were streamlined and AI agent performance scaled more effectively.

 “The unified testing platform gave us a huge efficiency boost and helped us make data-driven decisions faster,” says Ganatra. As a result, Composio’s AI-powered coding agent—built with SWE-Kit—increased accuracy by 50 percent, automated engineering tasks, reduced errors, and enhanced overall efficiency.

These advancements led to Composio securing the top ranking on SWE-Bench, a leading benchmark for evaluating AI-assisted code generation and problem-solving. This recognition validated the model’s accuracy and effectiveness compared to other AI-powered coding tools, strengthening Composio’s credibility. The platform’s success also attracted over 450 new developers and secured a major enterprise customer seeking AI-powered automation.

Additionally, support for parallelized experimentation has given Composio a competitive edge. “With AWS, we can scale experiments quickly to deliver high-performance AI automation solutions,” Ganatra concludes. “Amazon Bedrock enhances our model selection, reduces latency, and drives continuous performance improvements.”

Missing alt text value
With AWS, we can scale experiments quickly to deliver high-performance AI automation solutions. Amazon Bedrock enhances our model selection, reduces latency, and drives continuous performance improvements.

Soham Ganatra

Co-Founder & CEO, Composio

Get Started

Organizations of all sizes across all industries are transforming their businesses and delivering on their missions every day using AWS. Contact our experts and start your own AWS journey today.

Contact Sales

Did you find what you were looking for today?

Let us know so we can improve the quality of the content on our pages