AWS Partner Databricks Helps Rivian Drive Into the Future of Electric Transportation

Executive Summary

Rivian is preserving the natural world for future generations with revolutionary Electric Adventure Vehicles (EAVs). With over 11,000 EAVs on the road generating multiple terabytes of Internet of Things (IoT) data per day, the company is using data insights and machine learning (ML) from Databricks running on Amazon Web Services (AWS) to improve vehicle health and performance. Previously, with legacy cloud tooling, Rivian struggled to cost-effectively scale pipelines and spent significant resources on maintenance, slowing their ability to be truly data-driven. Since moving to the Databricks Lakehouse Platform, Rivian now understands how a vehicle performs and how this impacts the driver using it. Equipped with these insights, Rivian is innovating faster, reducing costs, and ultimately delivering a better driving experience to their customers.

Struggling to Democratize Data on a Legacy Platform

Building a world that will continue to be enjoyed by future generations requires a shift in the way the world operates. At the forefront of this movement is Rivian, an electric vehicle manufacturer focused on shifting the planet’s energy and transportation systems entirely away from fossil fuel. Today, Rivian’s fleet includes personal vehicles and involves a partnership with Amazon to deliver 100,000 commercial vans. Each vehicle uses IoT sensors and cameras to capture petabytes of data ranging from how the vehicle drives to how various parts function. With all this data at their fingertips, Rivian is using ML to improve the overall customer experience with predictive maintenance so that potential issues are addressed before they impact the driver.

Before Rivian even shipped its first EAV, it was already up against data visibility and tooling limitations that decreased output, prevented collaboration, and increased operational costs. They had 30 to 50 large and operationally complicated compute clusters at any given time, which became costly. Not only was the system difficult to manage, but the company experienced frequent cluster outages as well, forcing teams to dedicate more time to troubleshooting than to data analysis. Additionally, data silos created by disjointed systems slowed the sharing of data, which further contributed to productivity issues. The need for specific data languages and tool-set expertise created a barrier to entry that kept developers from making full use of the available data. Jason Shiverick, Principal Data Scientist at Rivian, said the biggest issue was data access. “I wanted to open our data to a broader audience of less technical users so they could also leverage data more easily.”

Rivian knew that once its EAVs hit the market, the amount of data ingested would explode. To deliver the reliability and performance they promised, Rivian needed an architecture that would not only democratize data access but also provide a common platform for building innovative solutions that help ensure a reliable and enjoyable driving experience. Rivian selected AWS Partner Databricks as its data platform partner and AWS as its cloud provider because of their expertise in the field.


“Databricks Lakehouse empowers us to lower the barrier of entry for data access across our organization so we can build the most innovative and reliable electric vehicles in the world.”

Wassym Bensaid
Vice President of Software Development, Rivian

Predicting Maintenance Issues with Databricks Lakehouse

To modernize their data infrastructure, Rivian chose the Databricks Lakehouse Platform, a collaborative effort between AWS and Databricks. This platform gave Rivian the ability to unify all their data into a common view for downstream analytics and ML. Now, different data teams have a range of accessible tools to deliver actionable insights for use cases ranging from predictive maintenance to smarter product development, supported by AWS services such as AWS Direct Connect, Amazon Simple Storage Service (Amazon S3), Amazon Elastic Kubernetes Service (Amazon EKS), and Amazon Elastic Compute Cloud (Amazon EC2).

Rivian’s advanced driver-assistance systems (ADAS) team can now easily prepare telemetric accelerometer data to understand all EAV motions. This core recording data includes information about pitch, roll, speed, suspension, and airbag activity to help Rivian understand vehicle performance, driving patterns, and connected car system predictability. Based on these key performance metrics, Rivian can improve the accuracy of smart features and the control that drivers have over them. Designed to take the stress out of long drives and driving in heavy traffic, features like adaptive cruise control, lane change assist, automatic emergency braking, and forward collision warning can be honed over time to continuously optimize the driving experience for customers.
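As a rough illustration of the kind of telemetry preparation described above, the following minimal Python sketch groups accelerometer-style records by vehicle and computes a few summary metrics. The record layout, field names (vehicle_id, pitch_deg, roll_deg, speed_kph), and sample values are assumptions made for illustration only; they are not Rivian's actual schema or pipeline, which runs on the Databricks Lakehouse Platform rather than in plain Python.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical telemetry records; field names and values are illustrative
# assumptions, not Rivian's actual schema.
records = [
    {"vehicle_id": "v1", "pitch_deg": 1.0, "roll_deg": 0.5, "speed_kph": 60.0},
    {"vehicle_id": "v1", "pitch_deg": 3.0, "roll_deg": 1.5, "speed_kph": 80.0},
    {"vehicle_id": "v2", "pitch_deg": -2.0, "roll_deg": 0.0, "speed_kph": 40.0},
]


def summarize_by_vehicle(rows):
    """Group rows by vehicle and compute mean speed and peak absolute pitch/roll."""
    grouped = defaultdict(list)
    for row in rows:
        grouped[row["vehicle_id"]].append(row)

    summary = {}
    for vehicle_id, vehicle_rows in grouped.items():
        summary[vehicle_id] = {
            "mean_speed_kph": mean(r["speed_kph"] for r in vehicle_rows),
            "max_abs_pitch_deg": max(abs(r["pitch_deg"]) for r in vehicle_rows),
            "max_abs_roll_deg": max(abs(r["roll_deg"]) for r in vehicle_rows),
        }
    return summary


summary = summarize_by_vehicle(records)
```

Per-vehicle summaries like these are the sort of input a team could track over time when tuning driver-assistance features, though the metrics that matter in practice would depend on the feature being tuned.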

Secure data sharing and collaboration were also facilitated with the Databricks Unity Catalog. Shiverick describes how unified governance for the lakehouse benefits Rivian’s productivity. “Unity Catalog gives us a truly centralized data catalog across all of our different teams,” he said. “Now we have proper access management and controls.” Venkat adds, “With Unity Catalog, we are centralizing data catalog and access management across various teams and workspaces, which has simplified governance.” End-to-end, version-controlled governance and auditability of sensitive data sources, like the ones used for autonomous driving systems, produce a simple but secure solution for feature engineering. This gives Rivian a competitive advantage in the race to capture the autonomous driving grid.

The Rivian R1S Adventure will be a hit with electric off-roaders.

Accelerating Into an Electrified and Sustainable World

The collaboration between Databricks and AWS enabled Rivian to scale their capacity to deliver valuable data insights with speed, efficiency, and cost-effectiveness. Rivian is primed to leverage more data to improve operations and the performance of their vehicles to enhance the customer experience. Venkat says, “The flexibility that the Lakehouse offers saves us a lot of money from a cloud perspective, and that’s a huge win for us.” With Databricks Lakehouse on AWS providing a unified and open source approach to data and analytics, the Vehicle Reliability Team is able to better understand how people are using their vehicles, and that helps to inform the design of future generations of vehicles. By leveraging the Databricks Lakehouse Platform, they have seen a 30%–50% increase in runtime performance, which has led to faster insights and model performance.

Shiverick explains, “From a reliability standpoint, we can make sure that components will withstand appropriate life cycles. It can be as simple as making sure door handles are beefy enough to endure constant usage, or as complicated as predictive and preventative maintenance to eliminate the chance of failure in the field. Generally speaking, we’re improving software quality based on key vehicle metrics for a better customer experience.”

From a design optimization perspective, Rivian’s unobstructed data view is also producing new diagnostic insights that can improve fleet health, safety, stability, and security. Venkat says, “We can perform remote diagnostics to triage a problem quickly, or have a mobile service come in, or potentially send an OTA to fix the problem with the software. All this needs so much visibility into the data, and that’s been possible with our partnership and integration on the platform itself.” Meanwhile, developers are actively building vehicle software to address issues along the way.
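To make the idea of flagging potential issues before they reach the driver more concrete, here is a minimal Python sketch of one common approach: comparing each new sensor reading against the mean and standard deviation of the readings just before it. This is an illustrative assumption, not Rivian's method. The function name flag_anomalies, the window size, the threshold, and the sample temperature values are all hypothetical.

```python
from statistics import mean, stdev


def flag_anomalies(readings, window=5, threshold=3.0):
    """Return indices of readings that deviate from the mean of the preceding
    `window` readings by more than `threshold` sample standard deviations."""
    flagged = []
    for i in range(window, len(readings)):
        history = readings[i - window:i]
        mu = mean(history)
        sigma = stdev(history)
        if sigma == 0:
            # A perfectly flat history gives no scale to compare against; skip.
            continue
        if abs(readings[i] - mu) / sigma > threshold:
            flagged.append(i)
    return flagged


# Hypothetical component temperature readings in degrees Celsius.
temps = [40.0, 41.0, 40.5, 39.5, 40.0, 40.2, 55.0, 40.1]
flagged = flag_anomalies(temps)
```

With these sample values, only the spike at index 6 is flagged. A production system would more likely use trained ML models over many signals, but the sketch shows the basic shape of turning raw telemetry into a maintenance signal.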

Moving forward, Rivian is seeing rapid adoption of Databricks Lakehouse across different teams, increasing the number of platform users from 5 to 250 in only one year. This has unlocked new use cases including using ML from AWS to optimize battery efficiency in colder temperatures, increasing the accuracy of autonomous driving systems, and serving commercial depots with vehicle health dashboards for early and ongoing maintenance. As more EAVs ship, and their fleet of commercial vans expands, Rivian will continue to leverage the troves of data generated by their EAVs to deliver new innovations and driving experiences that revolutionize sustainable transportation.


About Rivian

Rivian exists to create products and services that help our planet transition to carbon neutral energy and transportation. Rivian designs, develops, and manufactures category-defining electric vehicles and accessories and sells them directly to customers in the consumer and commercial markets. Rivian complements their vehicles with a full suite of proprietary, value-added services that address the entire lifecycle of the vehicle and deepen their customer relationships.

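AWS Services Used

  • AWS Direct Connect
  • Amazon Simple Storage Service (Amazon S3)
  • Amazon Elastic Kubernetes Service (Amazon EKS)
  • Amazon Elastic Compute Cloud (Amazon EC2)

Benefits

  • Scale capacity to deliver valuable data insights with speed, efficiency, and cost-effectiveness
  • Flexibility
  • Reliability
  • Design optimization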

About the AWS Partner Databricks

Databricks combines data warehouses and data lakes into a lakehouse architecture. More than 9,000 organizations worldwide — including Comcast, Condé Nast, and over 50% of the Fortune 500 — rely on the Databricks Lakehouse Platform to unify their data, analytics, and AI. Databricks is headquartered in San Francisco, with offices around the globe. Founded by the original creators of Apache Spark™, Delta Lake, and MLflow, Databricks’ mission is to help data teams solve the world’s toughest problems.

Published May 2023