Rad AI Drives Revenue by 10 Times Using Amazon EC2 P4d Instances Powered by NVIDIA

2021

Nearly 90 percent of US radiologists operate at or over capacity, according to a Mayo Clinic study. Rad AI helps alleviate their workloads with machine learning (ML) models that read detailed documents and automatically summarize the results in each radiologist’s own language. Ordering physicians use these summaries to identify ailments and devise treatment plans. Rad AI works with 16 percent of the US radiology market, including 6 of the 10 largest radiology groups, and wanted to expand its solution to serve more customers. To increase its ML inference speed and generate real-time conclusions, the company chose to use Amazon Web Services (AWS).

Rad AI migrated its document summary applications running on on-premises GPU servers to Amazon Elastic Compute Cloud (Amazon EC2) P4d Instances powered by NVIDIA A100 Tensor Core GPUs. By deploying its applications on Amazon EC2 P4d Instances, Rad AI significantly improved its ML inference times, delivering faster, more accurate reports to radiologists and improving the quality of patient care.

Improving Radiologist Efficiency Using Machine Learning

Rad AI is a software-as-a-service startup that aims to increase the quality of healthcare by streamlining radiology workflows. “Radiologists are quite efficient, but their study volume is so high that fatigue is common,” says Niven Shah, business development and strategy manager at Rad AI. “Our products use the latest advances in natural language processing to automatically generate customized conclusions for radiology reports, along with follow-up recommendations based on national guidelines.”

Rad AI reduces the number of words radiologists dictate by 30–35 percent per day and saves radiologists about 1 hour per 9-hour shift. Its products tie into existing workflows and operate as zero-click solutions. “We created Rad AI specifically to reduce radiologist burnout, improve the quality of patient care, and ensure that our patients get the appropriate follow-up and treatment at the right time,” says Dr. Jeff Chang, radiologist and cofounder at Rad AI. The company previously used Amazon EC2 P3 Instances to deploy its ML applications but wanted higher performance and faster inference speeds to serve more customers. Rad AI saw a way to meet its goals by migrating its ML models to Amazon EC2 P4d Instances, which are powered by NVIDIA A100 GPUs.

Amazon EC2 P4d Instances provide 320 GB of GPU memory per instance and are the first to support 400 Gbps of high-speed networking in the cloud. Their high performance and low latency make them ideal for processing larger documents at a faster rate. Using AWS services would also help Rad AI facilitate HIPAA compliance and meet the requirements of its System and Organization Controls 2 Type II certification, streamlining the onboarding of new radiology groups and health systems.

Increasing Performance, Scalability, and Inference Speeds to Serve Customers Faster

Rad AI completed the migration in 2021, improving its ML inference speeds and overall performance. “By migrating to Amazon EC2 P4d Instances, we improved our real-time inference speeds by 60 percent,” says Ali Demirci, senior software engineer at Rad AI. “Because we can generate summaries in real time, this solution made an immediate impact on the customer experience.” Compared to its on-premises deployments, Rad AI has seen a 136 percent increase in performance and 11 percent faster throughput by using Amazon EC2 P4d Instances for its cloud-based deployments. Faster speeds, improved performance, and cloud scaling enable the startup to deliver its solution to more customers, from smaller private practices to multi-billion-dollar healthcare systems.

Rad AI’s solution now delivers CT and MRI scan report summaries in 3 seconds instead of 10 seconds, and x-ray report summaries in 0.7 seconds instead of 2.5 seconds. By training some of its ML models on Amazon EC2 P4d Instances, Rad AI also cut training duration by a factor of 2.4. With improved inference speeds, radiologists can deliver more accurate reports, with appropriate follow-up recommendations, to physicians faster. Physicians then use these reports to diagnose conditions and create treatment plans, which in turn improves patient outcomes.
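The latency figures above imply speedup factors that the text does not state directly. As a rough illustration, the following Python sketch derives them from the quoted before-and-after times. The function and variable names are hypothetical, and the inputs are only the rounded figures from this case study, so the results are approximate.

```python
# Before/after summary latencies in seconds, as quoted in this case study.
LATENCIES = {
    "CT/MRI report summary": (10.0, 3.0),
    "x-ray report summary": (2.5, 0.7),
}


def speedup(before_s, after_s):
    """How many times faster the new latency is than the old one."""
    return before_s / after_s


def reduction_pct(before_s, after_s):
    """Percentage of the original latency that was eliminated."""
    return (before_s - after_s) / before_s * 100


for name, (before_s, after_s) in LATENCIES.items():
    print(
        f"{name}: {speedup(before_s, after_s):.1f}x faster, "
        f"{reduction_pct(before_s, after_s):.0f}% less waiting time"
    )
```

On these rounded inputs, both report types come out a little more than three times faster, with roughly 70 percent less waiting time per summary.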

Rad AI uses Amazon Elastic Container Service (Amazon ECS), a fully managed container orchestration service, to deploy several ML models per day. “Being able to continuously deploy using Amazon ECS enables us to respond to customer feedback quickly,” says Demirci. “We can simply tweak the models or make immediate changes as needed. Our ML team can provision instances quickly and automatically, which helps streamline experimentation on model improvements.” By migrating its ML inference to the cloud, Rad AI also removed the need to procure and provision infrastructure for its on-premises data centers. Instead, Rad AI is able to provision instances on demand, thereby optimizing its operating costs.

Rad AI also chose to develop, train, and deploy its ML technology using PyTorch, an open-source ML framework. PyTorch enables Rad AI to disassemble and reassemble components of its ML workflows for simple debugging and quick experimentation with newer, more advanced iterations of its ML training flow. Using PyTorch, the team can deliver more complex model architectures with less time invested in development and iteration.

The company also scaled its services on AWS, expanding to serve new clients. “When you need to deploy large ML models like we do, it requires a significant amount of GPU memory,” says Andriy Mulyar, ML engineer at Rad AI. “Amazon EC2 P4d Instances come with 40 GB of high-bandwidth memory per GPU and can effectively meet our memory requirements. Now we can scale our ML applications on demand without needing to provision physical hardware. We’re able to generate outputs for our customers at a much faster rate, which in turn increases our speed of innovation.” Because Rad AI is able to scale to serve more customers, the startup grew its client base by over 100 percent in 2021. Rad AI also increased its recurring revenue by more than 10 times in 2021, compared to all of 2020.
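The memory figures in this case study can be cross-checked with simple arithmetic: 320 GB per instance divided by 40 GB per GPU gives 8 GPUs per instance. The Python sketch below performs that check and adds a rough test of whether a model's weights fit on one GPU. The parameter counts, the 2-bytes-per-parameter default (16-bit weights), and the 80 percent headroom factor are illustrative assumptions, not details of Rad AI's models, and the estimate ignores activations and other runtime overhead.

```python
GPU_MEMORY_GB = 40        # per-GPU memory quoted by Rad AI
INSTANCE_MEMORY_GB = 320  # per-instance GPU memory quoted earlier

# 320 GB / 40 GB per GPU = 8 GPUs per instance.
GPUS_PER_INSTANCE = INSTANCE_MEMORY_GB // GPU_MEMORY_GB


def weights_gb(num_params, bytes_per_param=2):
    """Approximate size of the model weights in GB (1 GB = 1e9 bytes)."""
    return num_params * bytes_per_param / 1e9


def fits_on_one_gpu(num_params, bytes_per_param=2, headroom=0.8):
    """True if the weights fit within a fraction (headroom) of one GPU.

    The headroom factor is an assumed safety margin that leaves space
    for activations and framework overhead.
    """
    return weights_gb(num_params, bytes_per_param) <= GPU_MEMORY_GB * headroom


print(GPUS_PER_INSTANCE)
# Hypothetical 6-billion-parameter model with 16-bit weights: about 12 GB.
print(fits_on_one_gpu(6_000_000_000))
# Hypothetical 20-billion-parameter model with 32-bit weights: about 80 GB.
print(fits_on_one_gpu(20_000_000_000, bytes_per_param=4))
```

Under these assumptions the first hypothetical model fits on a single GPU and the second does not, so it would have to be split across several of the instance's GPUs.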

Optimizing Speed, Performance, and Customer Success on AWS

By migrating to Amazon EC2 P4d Instances, Rad AI increased revenue, accelerated innovation speed, and seamlessly scaled its ML applications, delivering real-time benefits to its customers and expanding its reach. For example, Radiology Associates of North Texas, the largest private radiology practice in Texas, expanded Rad AI services to all 225 of its radiologists after testing the company’s AWS-powered solution. In the future, Rad AI plans to further automate its data pipeline on AWS and will launch new ML-driven products to improve patient care.

Rad AI built a fast, high-performing solution quickly using AWS for ML application development and deployment. “Whenever you reach out for help from the AWS team, you’re connected with a knowledgeable person and issues get resolved very quickly,” says Demirci. “Working with the AWS team has been a huge benefit for us.”

About Rad AI

Rad AI is a startup using artificial intelligence to streamline radiology workflows and improve patient care. Headquartered in Berkeley, California, Rad AI strives to enhance access to high-quality healthcare while reducing physician burnout.

Benefits of AWS

• Increased recurring revenue by more than 10x in 2021 compared to 2020
• Increased performance by 136% compared to existing on-premises deployments
• Increased ML inference speeds by 60%
• Delivers CT and MRI scan report summaries in 3 seconds versus 10 seconds
• Delivers x-ray report summaries in 0.7 seconds versus 2.5 seconds
• Strengthened customer satisfaction
• Improved radiology patient outcomes
• Streamlined product deployments

AWS Services Used

Amazon EC2

Amazon Elastic Compute Cloud (Amazon EC2) is a web service that provides secure, resizable compute capacity in the cloud. It is designed to make web-scale cloud computing easier for developers.

Amazon Elastic Container Service

Amazon Elastic Container Service (Amazon ECS) is a fully managed container orchestration service that helps you easily deploy, manage, and scale containerized applications.