AWS Cloud Enterprise Strategy Blog

Measuring the Impact of AI Assistants on Software Development

Value Measurement

The speed of typing out code has never ever been the bottleneck for software development (not since keyboards became widespread from the 60s or 70s)

—Gergely Orosz

Software development is a complex value delivery system involving many interdependent roles, including developers, product managers, and platform engineers. Dependencies create potential bottlenecks, such as pull request review queues, that limit your entire system’s speed. The theory of constraints tells us that your system’s bottlenecks—not your fastest developer—determine how quickly you deliver value to your customers.

As AI coding assistants gain widespread adoption, individual developers have focused on how these tools make writing code faster. But focusing only on code is like making only one machine in a production line 30% faster. If you don’t optimize every process—quality control, packaging, shipping—production may be faster, but customers don’t receive their products sooner.

When you use AI assistants to speed up coding, bottlenecks shift elsewhere in the value stream. Delays accumulate when product managers need to clarify requirements because they can’t define clear acceptance criteria. Code review slows down when senior developers are overwhelmed by reviewing AI-generated code that, while syntactically correct, raises questions about its architecture.

My AWS colleagues Phil Le-Brun and Joe Cudby put it this way: “A clear shift is underway from a narrow focus on an individual developer’s productivity to a more expansive understanding of team development productivity at the organisational and SDLC levels.”

As executives, we must look beyond individual developer productivity and understand how AI assistants can transform the entire software delivery process.

Measuring Cost to Serve Software

At AWS we approach software development with systems thinking. As my colleague Jim Haughwout, VP of Software Builder Experience, recently shared: “Traditional measures of development productivity fall short” because they ignore the interconnected nature of software delivery. This understanding led Jim and his team to develop the Cost to Serve Software (CTS-SW) framework.

CTS-SW measures the total cost of delivering a unit of software (including development, infrastructure, and operational costs) divided by the number of software delivery units (e.g., deployments for microservices or pull request completions for monoliths). It measures the entire delivery system’s performance—not just individual coding speed. We used CTS-SW to systematically optimize our software pipeline, reducing costs by 15.9% YoY in 2024.
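The core ratio can be sketched in a few lines of Python. This is a simplified illustration with hypothetical cost categories and numbers, not the actual CTS-SW methodology, which involves additional internal inputs and tension metrics:

```python
from dataclasses import dataclass

@dataclass
class DeliveryCosts:
    """Hypothetical monthly cost inputs for one service (illustrative only)."""
    development: float     # engineering effort attributed to the service
    infrastructure: float  # compute, storage, networking
    operations: float      # on-call, monitoring, incident response

def cost_to_serve_software(costs: DeliveryCosts, delivery_units: int) -> float:
    """CTS-SW = total delivery cost / number of delivery units
    (e.g., deployments for microservices, merged PRs for monoliths)."""
    total = costs.development + costs.infrastructure + costs.operations
    return total / delivery_units

# Hypothetical example: $120k total monthly cost across 400 deployments
cts = cost_to_serve_software(DeliveryCosts(100_000, 15_000, 5_000), 400)
print(f"${cts:.2f} per delivery unit")  # $300.00 per delivery unit
```

Because the denominator is a delivery unit rather than lines of code, the metric improves only when the whole system ships more, or ships the same amount at lower total cost.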

Why AI Accelerates More Than Coding

Systems thinking becomes even more critical as AI capabilities evolve. Kiro, AWS’s specification-driven integrated development environment (IDE), exemplifies this approach. It doesn’t just generate code—it helps developers automate documentation and create unit tests.

I’m seeing customers use Kiro to deploy AI across all development teams: Product managers use Kiro for requirements analysis, UX designers for rapid prototyping, and operations teams for automated observability.

When parts of your system accelerate at different speeds, new bottlenecks arise that require your attention. CTS-SW gives you crucial insight into the software development process by using tension metrics: safeguard measurements that ensure improvements in one area don’t compromise critical areas like security or resilience.

To capture AI’s broader impact on the organization, I recommend these additional metrics:

  1. Delivered business value: Increased conversion rate, revenue impact of new features, reduced service calls
  2. Customer cycle time: Days from feature request to customer use, time to resolve customer-reported issues
  3. Development throughput: Features delivered per week that customers actually use, successful releases per day
  4. Quality and reliability: Production incident rates, customer satisfaction scores, and security vulnerability resolution time
  5. Team satisfaction: Retention rates, engagement surveys, and internal satisfaction with development processes

Track these metrics continuously for every team and domain in your organization to see how performance develops over time.
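A minimal sketch of such continuous tracking might look like the following. All team names, metric names, and values here are hypothetical; the point is the shape of the data: one time series per team per metric, observed every release cycle:

```python
from collections import defaultdict

# Per-team time series: metrics[team][metric] -> list of periodic observations
metrics: dict[str, dict[str, list[float]]] = defaultdict(lambda: defaultdict(list))

def record(team: str, metric: str, value: float) -> None:
    """Append one periodic observation (e.g., per sprint or release)."""
    metrics[team][metric].append(value)

def trend(team: str, metric: str) -> float:
    """Relative change from the first to the latest observation."""
    series = metrics[team][metric]
    return (series[-1] - series[0]) / series[0]

# Hypothetical data: customer cycle time (days) for one team over three releases
for days in (14.0, 11.0, 9.0):
    record("payments", "customer_cycle_time_days", days)
print(f"{trend('payments', 'customer_cycle_time_days'):+.0%}")  # -36%
```

The trend per team and domain, not any single snapshot, is what reveals whether an intervention such as an AI assistant is actually moving the system.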

Comparing Teams to Measure AI’s Impact

To measure the effect of AI assistants on your process, compare teams that use AI assistants to those that don’t. I recommend using A/B testing.

A/B testing means giving different groups different versions of a product to see which one works better. In larger organizations, it works wonderfully to measure the impact of new tools and workflows—such as AI assistants.

Identify two teams working on similar products with similar technology stacks and complexity. Make sure the teams are of similar size and work on similar problem domains. Give one team access to AI coding assistants while the other continues with current practices. Track key business metrics—delivered business value, cycle time, throughput, quality, cost, and team motivation—over the next 2-3 release cycles.
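Once both teams report the same metrics per release, the comparison itself is simple. The sketch below uses hypothetical cycle-time observations; with only a handful of release cycles per team, treat the result as a directional signal rather than a statistically significant finding:

```python
from statistics import mean

# Hypothetical customer cycle time (days) per release cycle
control = [14.0, 13.5, 14.2]    # team without AI assistants
treatment = [12.0, 10.5, 9.8]   # team with AI assistants

def relative_change(control_values: list[float],
                    treatment_values: list[float]) -> float:
    """Relative difference in means; negative means the treatment team is faster."""
    c, t = mean(control_values), mean(treatment_values)
    return (t - c) / c

print(f"cycle time change: {relative_change(control, treatment):+.1%}")  # -22.5%
```

Run the same comparison for each metric in the list above, since a gain in throughput that comes with a loss in quality or team satisfaction is not a net win.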

The length of this experiment depends on its hypothesis and your organization’s cycle time for feature development. High-performing organizations with short cycle times can iterate faster, which makes experimentation more efficient.

During your A/B tests, don’t just measure cycle time and throughput at the end of your delivery pipeline. Use value stream analysis to measure flow at each critical handover point. Identify where work accumulates after you introduce AI and optimize those constraints accordingly.
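Finding where work accumulates can be as simple as comparing average queue time at each handover point. This sketch assumes you already timestamp when work enters and leaves each stage; the stage names and hours are hypothetical:

```python
from statistics import mean

# Hypothetical hours each work item spent queued at each handover point
queue_times = {
    "requirements_clarification": [4.0, 6.0, 5.5],
    "code_review":                [20.0, 26.0, 31.0],
    "deployment_approval":        [2.0, 3.0, 2.5],
}

def bottleneck(queues: dict[str, list[float]]) -> str:
    """Return the stage where work accumulates the most, on average."""
    return max(queues, key=lambda stage: mean(queues[stage]))

print(bottleneck(queue_times))  # code_review
```

In this example, faster coding has pushed the constraint into code review, which is exactly the kind of shift the theory of constraints predicts and the kind of finding that should direct your next optimization.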

Start Measuring Now

If you want to understand how AI assistants affect your organization, don’t rely on social media reports. Start collecting your own data now. An A/B test will tell you more about AI’s impact on your business than the percentage of AI-generated code, which is a common vanity metric that measures activity, not outcomes.

AI assistants will likely transform how we work and develop software—but not all organizations will benefit equally. The winners will be those who measure and optimize their entire value delivery system, not just their coding speed.

References

  1. A CTO’s Guide to Measuring Software Development Productivity, March 2025
  2. Quantifying the Impact of Developer Experience: Amazon’s 15.9% Breakthrough, July 2025
  3. Development Productivity in the Age of Generative AI, May 2024
  4. Build a Value-Driving AI Strategy for Business Growth, Gartner
  5. Technology alone is never enough for true productivity, McKinsey, September 2024
Matthias Patzak

Matthias joined the Enterprise Strategist team in early 2023 after a stint as a Principal Advisor in AWS Solutions Architecture. In this role, Matthias works with executive teams on how the cloud can help to increase the speed of innovation, the efficiency of their IT, and the business value their technology generates from a people, process, and technology perspective. Before joining AWS, Matthias was Vice President IT at AutoScout24 and Managing Director at Home Shopping Europe. In both companies he introduced lean-agile operational models at scale and led successful cloud transformations resulting in shorter delivery times, increased business value, and higher company valuations.