2023
Automating ETL and Improving Customer Personalization Using Amazon Kinesis Data Streams with TrueCar

Learn how TrueCar, an online automotive marketplace, reduced the wait for clickstream analytics from 4 hours to 5 minutes using Amazon Kinesis Data Streams.

48x improvement in time to insight for clickstream analytics

Automated the extract, transform, load process

Increased observability

Improved customer personalization

Overview

TrueCar wanted to reduce data latency and drive analytics insights to improve performance and customer personalization. On its third-party legacy solution, the large US-based car-buying-and-selling marketplace saw data latency of 24 hours. The company performs dozens of A/B tests per day, and with its legacy solution it had to wait for the results of those tests. To shorten the time between data ingestion and the business insights and analytical queries that depend on it, TrueCar turned to Amazon Web Services (AWS).

To control its own data, reduce latency, and streamline operations, the company uses Amazon Kinesis Data Streams, a serverless streaming data service that makes it simple to capture, process, and store data streams at virtually any scale. TrueCar is actively updating its analytics solution and has already improved the speed to insight of clickstream analytics by 48 times; automated its extract, transform, load (ETL) process; and improved observability into its metrics thus far in its journey.

Opportunity | Using Amazon Kinesis Data Streams to Reduce Latency and Streamline Operations for TrueCar

Founded in 2005, TrueCar displays pricing and information on new and used cars from over 11,000 dealers, and it helps customers get instant offers to sell their used cars. With the immense number of vehicles on its website, gaining quick insights and providing personalized results for customers are important. Because TrueCar does continuous integration and continuous delivery (CI/CD) deployments on the front end and back end, it wanted to gain insights more quickly and to detect anomalies or problems with the performance of specific features. With the third-party analytics provider that TrueCar used, the company was not in control of its web traffic data, and it was difficult to add or remove data fields. Additionally, with the high data latency, the company was subject to the timelines of this third-party provider.

In 2017, TrueCar started its journey using AWS and Amazon Kinesis Data Streams, but it still ran its analytics on its previous solution. “We have run our infrastructure completely on AWS for almost 5 years,” says Anil Gupta, distinguished engineer at TrueCar. “It was not a natural fit that our biggest dataset was originating outside that infrastructure.” TrueCar used Amazon Kinesis Data Streams to build more interactive applications and reduce latency for updates in its business-critical inventory application from hours to minutes. At the end of 2019, TrueCar began running proofs of concept with several potential solutions, searching for one that was built for data streaming.

In 2020, to pair its use of Amazon Kinesis Data Streams with a near-real-time analytics database, TrueCar chose to work with Imply, an AWS Partner, which deploys the Apache Druid database. “With a proof of concept, we were able to show that we can write data in near real time, ingest it, then get results within a few seconds,” says Gupta. “There was no need for extra support, and Apache Druid supported native Amazon Kinesis Data Streams ingestion.” TrueCar’s new solution for clickstream analytics takes information from its data sources using a data router and writes the data to Amazon Kinesis Data Streams. A Spark Streaming process then ingests and processes the data in near real time and writes it to Apache Druid. With up to 25 million rows ingested into Apache Druid daily and 10 billion rows in its clickstream table, the new solution had a significant impact for TrueCar.
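The first hop of this pipeline, a data router writing click events to a stream, can be sketched as follows. This is a minimal illustration, not TrueCar's router: the stream name, the event fields, and the `build_kinesis_record` helper are assumptions made for the example. Only the `StreamName`, `Data`, and `PartitionKey` parameters come from the Kinesis PutRecord API.

```python
import json
import uuid
from datetime import datetime, timezone


def build_kinesis_record(event: dict, stream_name: str = "clickstream-events") -> dict:
    """Build keyword arguments for a Kinesis PutRecord call from one click event.

    The stream name and the event fields are hypothetical.
    """
    payload = {
        "event_id": str(uuid.uuid4()),
        "received_at": datetime.now(timezone.utc).isoformat(),
        **event,
    }
    return {
        "StreamName": stream_name,
        # Kinesis carries the record body as bytes, so the JSON is encoded.
        "Data": json.dumps(payload, separators=(",", ":")).encode("utf-8"),
        # Records that share a partition key map to the same shard,
        # which keeps one session's events together.
        "PartitionKey": str(event["session_id"]),
    }


record = build_kinesis_record(
    {"session_id": "s-123", "page": "/used-cars", "action": "view"}
)
# With boto3 installed and AWS credentials configured, the record could be sent with:
#   boto3.client("kinesis").put_record(**record)
```

Using the session ID as the partition key is one possible design choice; it trades even shard distribution for keeping each session's events on a single shard.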

“Using Amazon Kinesis Data Streams provides data to the appropriate teams in a consumable manner and reduces all friction points.”

Anil Gupta
Distinguished Engineer, TrueCar

Solution | Improving Time to Insight for Clickstream Analytics by 48 Times Using Amazon Kinesis Data Streams

With this new solution, TrueCar runs its ETL code through Spark Streaming. Incoming data from the website is pulled through this process every 10 seconds, which automates the ETL and saves the company time. The data now flows directly from Amazon Kinesis Data Streams to Spark Streaming, and the ETL happens in near real time instead of after data has been exported, something that was not possible for TrueCar before this solution.
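The 10-second cadence described above is a micro-batch pattern: events are grouped into short, fixed windows and each window is transformed as a unit. The sketch below shows that idea in plain Python rather than Spark Streaming, so it runs without a cluster. The event fields and the `transform` step are hypothetical; only the 10-second interval comes from the text.

```python
from collections import defaultdict

BATCH_SECONDS = 10  # batch interval taken from the text


def transform(event: dict) -> dict:
    """Hypothetical transform step: normalize the page path and keep needed fields."""
    return {
        "ts": event["ts"],
        "session_id": event["session_id"],
        "page": event["page"].rstrip("/").lower() or "/",
    }


def micro_batches(events: list[dict]) -> dict[int, list[dict]]:
    """Group events into consecutive 10-second windows keyed by window start."""
    batches = defaultdict(list)
    for event in events:
        window_start = int(event["ts"] // BATCH_SECONDS) * BATCH_SECONDS
        batches[window_start].append(transform(event))
    return dict(sorted(batches.items()))


# Timestamps are seconds since an arbitrary start, for illustration only.
events = [
    {"ts": 3.2, "session_id": "a", "page": "/Used-Cars/"},
    {"ts": 9.9, "session_id": "b", "page": "/"},
    {"ts": 10.0, "session_id": "a", "page": "/sell"},
]
batches = micro_batches(events)
```

In a real streaming job the framework would deliver each window as it closes; here the whole list is grouped at once only to show which events land in which batch.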

The biggest time-saving benefit for TrueCar was in time to insight for clickstream data. “We used to have a pipeline to query our clickstream data that took 4 hours and had 50 processes with heavy lifting being done in the Spark Streaming cluster,” says Gupta. “We went from that to running a query in 5 minutes.” That is a 48-times improvement in the time to insight for clickstream analytics: 4 hours is 240 minutes, and 240 divided by 5 is 48. The company runs dozens of A/B tests daily, and reducing data latency from 24 hours to a few seconds lets it iterate on product features much faster.

The company can also now support its CI/CD deployments and A/B testing in near real time instead of over 1 or 2 days. As a result, TrueCar can serve its customers faster. CI/CD is a way of developing software so that updates can be released at any time in a sustainable way, and because there is minimal data latency, TrueCar can catch and fix issues with those deployments quickly. The company’s reaction time is far shorter. “This solution improves observability for things outside of revenue,” says Gupta. “We can look for anomalies in the metrics not being driven by marketing and sales.” TrueCar can see web insights with a latency of a few seconds rather than 24 hours and can take a more granular approach to analytics. In addition, the new solution has native support for JSON, which is helpful with web analytics data.
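The story does not say how TrueCar looks for anomalies in its metrics. As one simple illustration of the idea, the sketch below flags a metric value that sits far from the mean of the values just before it. The function name, the window size, the threshold, and the sample data are all assumptions.

```python
from statistics import mean, stdev


def find_anomalies(values: list[float], window: int = 5, threshold: float = 3.0) -> list[int]:
    """Return the indexes of values that deviate from the mean of the preceding
    `window` values by more than `threshold` standard deviations."""
    anomalies = []
    for i in range(window, len(values)):
        history = values[i - window:i]
        mu, sigma = mean(history), stdev(history)
        # A flat history has zero spread, so it is skipped to avoid false alarms.
        if sigma > 0 and abs(values[i] - mu) > threshold * sigma:
            anomalies.append(i)
    return anomalies


# Hypothetical per-minute error counts for one feature after a deployment.
errors_per_minute = [4, 5, 3, 4, 5, 4, 30, 5, 4]
anomalous_minutes = find_anomalies(errors_per_minute)  # [6], the minute with 30 errors
```

A check like this is only useful when the metric arrives within seconds or minutes of the deployment, which is the benefit the lower data latency provides.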

Having a quicker understanding of user behavior means that TrueCar can improve personalization for its customers, the next big step in its journey. After a user’s first session on the website, the company uses its enhanced analytics to personalize the user’s experience while the user is offline. TrueCar can observe customer responses to A/B testing almost instantly and can make adjustments quickly. “We are using all the tools that we have, like Apache Druid and Amazon Kinesis Data Streams, to start personalization offline,” says Gupta. “We analyze the data based on user actions, then have some metadata to support that user when they next visit our site.” TrueCar is using its new solution to unlock near-real-time data insights.

Outcome | Innovating with Analytics Using Amazon Kinesis Data Streams

TrueCar is still working to migrate fully off its legacy solution in 2023 and is happy with the new state of its tech stack. The company is also continuing to innovate in customer personalization using Amazon Kinesis Data Streams and Apache Druid, which is possible because of its new streaming data architecture. It sees the potential of applying near-real-time analytics to user sessions on the website to personalize interactions for TrueCar customers within seconds.

With the new solution, the company has unlocked near-real-time data insights and can react quickly. Additionally, it can look at data analytics without worrying about latency. “Using Amazon Kinesis Data Streams provides data to the appropriate teams in a consumable manner and reduces all friction points,” says Gupta.

About TrueCar

TrueCar is a large online marketplace for automotive buying and selling in the United States. Founded in 2005, the service lists both new and used vehicles and has over 11,000 dealers on its platform.

AWS Services Used

Amazon Kinesis Data Streams

Amazon Kinesis Data Streams is a serverless streaming data service that makes it easy to capture, process, and store data streams at any scale.
