
Refact.ai Sees 1.5x Price Performance as the First AI Coding Assistant on AWS Inferentia2

Learn how technology startup Refact.ai offers cost-effective coding assistance with enhanced security features using AWS Inferentia2.

Key Outcomes

Up to 20% higher performance

1.5x higher performance per dollar

Overview

More businesses are accelerating application development using coding assistants that are powered by artificial intelligence (AI). At the same time, companies have concerns about whether that process means transferring their code base—their critical intellectual property—to third parties. These companies are also looking for solutions that will broaden adoption in their organizations while keeping costs low. Technology startup Refact.ai addresses these concerns by providing an AI assistant that runs in the customer’s own trusted environment.

To use Refact.ai’s software-as-a-service product, customers need to run GPUs, which may not always be readily available or fit within budget constraints, especially for large workloads. Refact.ai, which uses Amazon Web Services (AWS), adapted its solution to run on Amazon Elastic Compute Cloud (Amazon EC2) instances, which provide secure and resizable compute capacity. The company uses AWS Inferentia, specifically AWS Inferentia2 chips, which are designed by AWS to deliver high performance at low cost in Amazon EC2 for deep learning and generative AI inference applications. Now, Refact.ai provides the first AI coding assistant to run on Amazon EC2 Inf2 Instances, improving price performance for customers, who no longer have to worry about finding GPUs or exceeding their budget.


About Refact.ai

Refact.ai is a coding assistant for Visual Studio Code, JetBrains, and other integrated development environments that boosts developers’ productivity while keeping data secure. Refact.ai is going a step further by developing an Autonomous AI Agent that will complete developers’ tasks end to end.

Opportunity | Using AWS Inferentia2 to Improve Cost Efficiency for Refact.ai

Refact.ai is an AI coding assistant that provides coding suggestions and context-aware answers through chat in integrated development environments such as Visual Studio Code and JetBrains, accelerating developers’ tasks. The tool specifically empowers customers to maintain control over their own data, thus enhancing data security. Customers can deploy Refact.ai within their own environments so that data does not need to be sent to a third party to run inference. As the industry moves in this direction, Refact.ai is also building AI agents, autonomous AI programs that can handle tasks without manual intervention.

Refact.ai is also available in AWS Marketplace. The product uses retrieval-augmented generation to deliver highly accurate, context-aware code suggestions by continuously learning from users’ internal codebases. It can be deployed on premises on customer servers so that data remains fully secure and under customer control.

Given the high cost and lack of capacity availability for GPUs, Refact.ai needed to provide customers with a solution that is built on readily available, high-performing, and cost-effective compute. “One of the main challenges for a small company to survive in this industry is to have access to computing resources and GPUs,” says Oleg Kiyashko, cofounder of Refact.ai. “Therefore, we were looking for compute options that are more readily available but that could also help us reduce inference costs.”

In August 2024, Refact.ai tested AWS Inferentia2–powered Amazon EC2 Inf2 Instances, which provide high performance at low cost in Amazon EC2 for generative AI inference. Refact.ai then adapted the large language model StarCoder, which it used for its coding assistant, to run on these instances using AWS Neuron, a software development kit to optimize machine learning on AWS Trainium and AWS Inferentia accelerators. The result is now available in AWS Marketplace. “We migrated to AWS Inferentia2 to be more cost-effective for new users, and some of our previous customers have also migrated to the new product,” says Kirill Starkov, senior developer at Refact.ai. “I really appreciated the AWS service team during the migration because it was very open to our questions.”
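Refact.ai’s exact build pipeline is not public, but adapting a model such as StarCoder for Inf2 Instances with the AWS Neuron toolchain commonly goes through Hugging Face’s Optimum Neuron library. The sketch below is illustrative only: the batch size, sequence length, core count, and precision are assumptions, not Refact.ai’s production settings, and the export step itself must run on an Inf2 instance with the Neuron SDK installed.

```python
def neuron_export_config(batch_size=1, sequence_length=2048, num_cores=2):
    """Illustrative compiler settings for exporting a causal LM to Neuron.

    These values (batch size, sequence length, core count, fp16 casting)
    are assumptions for a small Inf2 deployment, not Refact.ai's
    production configuration.
    """
    return {
        "batch_size": batch_size,
        "sequence_length": sequence_length,
        "num_cores": num_cores,
        "auto_cast_type": "fp16",
    }

if __name__ == "__main__":
    # Requires an Inf2 instance with the AWS Neuron SDK and the
    # `optimum-neuron` package installed.
    from optimum.neuron import NeuronModelForCausalLM

    model = NeuronModelForCausalLM.from_pretrained(
        "bigcode/starcoder",   # the LLM named in the case study
        export=True,           # compile the model for NeuronCores on load
        **neuron_export_config(),
    )
    model.save_pretrained("starcoder-neuron")  # reusable compiled artifact
```

Saving the compiled artifact means the expensive compilation step runs once; subsequent instances can load the pre-compiled model directly.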

Solution | Getting 1.5x Higher Performance per Dollar Using AWS Inferentia2

Using AWS Inferentia2, Refact.ai’s customers get up to 1.5 times the performance per dollar spent. While testing Inf2 Instances, the company measured performance improvements of up to 20 percent compared with similar instances. Crucially, Inf2 Instances are also more widely available, which helps Refact.ai attract new clients and scale faster. “Availability is a very important aspect for our clients,” says Kiyashko.
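The two headline numbers are consistent with simple arithmetic: roughly 20 percent more throughput at a proportionally lower hourly price yields about 1.5 times the work per dollar. The figures below are hypothetical, chosen only to illustrate the relationship; they are not Refact.ai’s published benchmarks.

```python
def price_performance(tokens_per_second, hourly_price_usd):
    """Tokens generated per dollar of instance time."""
    return tokens_per_second * 3600 / hourly_price_usd

# Hypothetical throughput and price figures, for illustration only.
baseline = price_performance(tokens_per_second=100.0, hourly_price_usd=1.50)
inf2 = price_performance(tokens_per_second=120.0, hourly_price_usd=1.20)  # +20% throughput

print(round(inf2 / baseline, 2))  # → 1.5
```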

One of Refact.ai’s customers has already started running Refact.ai on AWS Inferentia2 chips and received good results within weeks. “Refact.ai is significantly reducing the time spent on writing boilerplate and repetitive code,” says Yury Zhytkou, director of engineering at Kepler Team, in a review on AWS Marketplace. “We’ve found the system to be stable, and getting instances up and running using AWS Neuron is straightforward. Refact.ai’s support team has been responsive and quick to address any questions we’ve had, which helps keep things moving efficiently.”

Under the AWS shared responsibility model, Refact.ai customers have a trusted environment in which to deploy Refact.ai and get feedback and suggestions on their code. Their data stays in their AWS environment without being sent to a third party.

Refact.ai is also building an autonomous AI agent that will handle engineering tasks. “The AI agent will learn from every interaction and organize the experience into a knowledge base, becoming smarter over time,” says Katrin Maikova, product marketing manager at Refact.ai. “Refact.ai Agent goes beyond the codebase, becoming a digital twin of the developer by accessing and interacting with all the tools and resources developers typically use—such as databases and documentation. Refact.ai Agent learns from these interactions and becomes more valuable as it integrates with existing workflows and tools.”

“These kinds of AI agents are going to become more common,” says Kiyashko. “AI assistance tools are moving toward these autonomous agents, which will learn and can deliver outcomes. That means we’re going to have a lot more inference and much more GPU usage in the coming years.” As a result, using cost-efficient AWS Inferentia2 instances will become even more valuable for customers.

AWS Inferentia2 chips are an ideal size for running these coding agents: large enough to hold the models, but not so large that capacity goes unused. “We run a typical agent on a 7-billion-parameter model, which works perfectly on an AWS Inferentia2 core,” says Kiyashko.
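Back-of-the-envelope memory math supports this: a 7-billion-parameter model in 16-bit precision needs about 14 GB for weights, comfortably within the 32 GB of accelerator memory an Inferentia2 chip provides (per AWS’s published specifications). The snippet below sketches that calculation; the 20 percent overhead factor for the KV cache and runtime buffers is an assumption.

```python
def fits_on_chip(params_billion, bytes_per_param=2, chip_memory_gb=32,
                 overhead=1.2):
    """Rough check that a model's weights, plus an assumed 20% overhead
    for KV cache and runtime buffers, fit in one accelerator's memory."""
    weights_gb = params_billion * bytes_per_param  # 1e9 params * bytes ≈ GB
    return weights_gb * overhead <= chip_memory_gb

print(fits_on_chip(7))   # 14 GB of fp16 weights * 1.2 = 16.8 GB → True
print(fits_on_chip(70))  # a 70B model would need multiple chips → False
```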

Outcome | Developing AI Autopilots with Affordable Inference Using AWS Inferentia2

Refact.ai continues to develop AI that will act not only as a copilot for developers but also as an autopilot that is capable of completing tasks autonomously. “We foresee a future where a lot of code is written by the machines, and all these AI agents will need a place to run inference,” says Kiyashko. “Our solution for AWS customers will be based on AWS Inferentia2 because it provides excellent performance at a low price.”

