AWS for Industries

In the News: The Power of the Cloud

This article originally appeared on Automotive News.

Cloud computing allows Autonomous Vehicle programs to focus on speed and building value

Big achievements require big investments, and according to research firm PitchBook, autonomous vehicle (AV) startups spend $1.6M every month on average.

The lion’s share of that cost goes into testing and the infrastructure needed to support it. A single real-world autonomous prototype vehicle, outfitted with sensors and data recording gear, can cost up to $500,000. The cost of processing and storing all the data generated isn’t too far behind.

In 2017, carmakers and tech companies alike, such as GM, Hyundai, Volkswagen, and Waymo, were predicting self-driving cars would arrive by 2020, and spending big to make it happen. From August 2014 to June 2017, automakers and tech companies invested $80B in AV technology, according to a study by the Brookings Institution.

Only a year later, accidents involving self-driving prototypes seemed to cause developers to lose momentum. At the same time, the technical challenges of reaching Level 5 autonomy – full self-driving vehicles – proved ever more complex. By 2019, many developers were backing off their 2020 plans.

The journey to Level 5 continues, and AV developers are still working toward the societal improvements it will bring, but they also have some new priorities. And they have found new opportunities in cloud computing, which can slash costs and development times.

First, they’re refocusing on near-term paths to marketable products and services that mesh with their long-term goals: trucking and logistics, advanced driver-assistance systems (ADAS), and providing data insights. Some developers are building value and demonstrating real-world efficiencies right now.

“Companies like Wejo have hidden gems in their data,” says John Barrus, global business development manager for autonomous technology at Amazon Web Services (AWS). “They can partner with municipalities and provide vital information to maintain infrastructure, mapping assets like stop signs or identifying potholes and road closures. They’re just refining data they capture into value.”

Similarly, Level 4 autonomous trucking startup TuSimple launched an autonomous freight network in July. The company is demonstrating the kinds of safety and efficiency gains to be made from autonomous operation on paid cargo hauling while it refines its product in real-world testing.

Second, Barrus says, AV developers are looking to economize their existing workflows by doing more testing in virtual environments and speeding up processing and storage.

In both cases, cloud computing provides opportunities to speed up and scale development with great flexibility while making data more accessible for those nearer-term goals. COVID-19 has further accelerated existing trends toward cloud solutions.

After 2018 it became standard practice to have two people ride in every on-road AV test vehicle for safety reasons. Social distancing rules make that much more difficult, which means many AV testing fleets are currently parked.

“Many of our AV customers can’t test in the real world right now and want to do more simulation,” says Nisarg Modi, AWS’ head of worldwide business development. However, Modi adds, “If you increase your reliance on simulation but don’t have computing infrastructure ready, you’re going to burn a lot of cash.”

Barrus says that AWS sees itself as a partner for those who need computing infrastructure. “AV startups don’t need to be setting up their own hardware for computing or storage,” he adds. “It’s much more efficient to let a partner provide those pieces and turn those tasks into API calls.”

Bite size computing

AV development involves gathering large amounts of data from real-world testing and simulation. Then comes training: the data is fed into new behavioral models for prototypes, which are tested in simulation and validated again, generating a new cycle of data.

“Simulation is one of the key ways we improve the safety of our software before it goes anywhere – even on a test track,” says Timothy Perrett, senior staff engineer at Lyft Level 5. It’s a process that demands flexibility.

Lyft Level 5 was launched in 2017 to give the ride-sharing company a direct buy-in to the advantages of AV technology, with an eye towards autonomous ride-sharing vehicles. Using petabytes of data gathered from its AV fleet, Lyft’s engineers run millions of simulations each year to improve the performance and safety of its self-driving system.

Validating the software and processing all of this data is a huge problem at scale, as are the vast quantities of data generated in simulation. Lyft itself has used AWS as a cloud partner since 2012, but using the cloud to interact with customers in the field is not the same as using it for simulations and data processing.

“Level 5 has different needs and constraints,” says Perrett. “Most of our compute needs are in servicing large, batch-style workloads that have a very spiky profile. We need the ability to burst up to high peak loads and then quickly turn everything down when we’re not using it.”

The solution lay in using Spot Instances. “When we experimented with running on Amazon EC2 (Elastic Compute Cloud) Spot Instances, we realized that as our program was growing quickly, there was an opportunity to significantly reduce our operational costs,” says Perrett.

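Spot Instances let users run workloads on spare capacity in Amazon’s Elastic Compute Cloud (EC2) at a discount to standard on-demand pricing. The trade-off is that AWS can reclaim that capacity on short notice, which makes Spot a natural fit for batch work that can be retried. Users pay only for what they use and can scale up or down as workloads demand.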

They also allow jobs to be packaged together to run concurrently. That can speed up the processing of large batches of data and it means no individual team or engineer has to sit idle while another gets priority computing time. By adapting to the spot instance structure, Perrett says, “We became smarter about how we allocate work and how we relocate jobs in a given resource pool on a given day.”
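The idea of packing jobs into a shared pool and reallocating them when capacity changes can be illustrated with a small simulation. This is only a sketch under stated assumptions, not Lyft’s scheduler: the function name `run_batch`, the fixed pool size, and the random interruption model are all hypothetical, chosen to show how a batch queue can tolerate Spot capacity being reclaimed by retrying interrupted jobs.

```python
import random
from collections import deque


def run_batch(jobs, workers, interrupt_prob, seed=0):
    """Run jobs in rounds on interruptible workers, requeueing interrupted jobs.

    Hypothetical model: each round, up to `workers` jobs run concurrently and
    each one is independently interrupted with probability `interrupt_prob`.
    Returns (completed_jobs, rounds_used).
    """
    if workers < 1:
        raise ValueError("workers must be at least 1")
    if not 0 <= interrupt_prob < 1:
        raise ValueError("interrupt_prob must be in [0, 1)")

    rng = random.Random(seed)  # seeded so runs are repeatable
    pending = deque(jobs)
    completed = []
    rounds = 0
    while pending:
        rounds += 1
        # Pack as many pending jobs as the pool can hold into this round.
        batch = [pending.popleft() for _ in range(min(workers, len(pending)))]
        for job in batch:
            if rng.random() < interrupt_prob:
                pending.append(job)  # capacity was reclaimed: retry later
            else:
                completed.append(job)
    return completed, rounds


if __name__ == "__main__":
    sims = [f"sim-{i}" for i in range(100)]
    done, rounds = run_batch(sims, workers=10, interrupt_prob=0.2)
    print(len(done), "simulations finished in", rounds, "rounds")
```

Because every interrupted job goes back on the queue, the batch always finishes; interruptions only cost extra rounds, which is the trade-off that makes discounted, reclaimable capacity attractive for simulation workloads.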

By carefully partitioning and directing its simulation traffic, Perrett’s team reduced the cost of simulations to just pennies for each execution. “About 77 percent of our computing fleet across all Level 5 workloads is now on AWS EC2 Spot instances,” he says, “and the cost savings overall has been around two-thirds.”

On the road now

Level 5, full self-driving autonomy, might be the ultimate goal of all AV developers, but some companies are already putting their technology into revenue service today.

“We launched the world’s first autonomous freight network on July 1,” says Jason Wallace, marketing director for AV trucking startup TuSimple. “We’re currently shipping all the way from Phoenix to Fort Worth with UPS.” The company is targeting nationwide operation in 2024.

TuSimple’s trucks are already operating at what’s commonly defined as Level 4 autonomy. They can drive themselves, but primarily within an environment they were created for.

Trucks generally move from warehouse to warehouse on highways that are well marked and maintained. That’s easier for AV systems in theory, but big rigs are 70 feet long, hard to maneuver and stop, and typically have less than 3 feet of clearance on either side in their lanes.

All of the company’s trucks have a human at the wheel, and both humans and computers drive on different occasions. When human drivers are in control, the system records the driver’s behavior and the decisions the AI would have made, which are then analyzed through machine learning.

“We’ve got more than 50 trucks now, and every time one comes back from a mission we’re uploading massive amounts of data into the cloud,” Wallace adds. “We need to be able to transfer that data quickly and securely, and AWS has really helped there.”

Early on, TuSimple would collect data on its own hard drives and then undergo a 15-hour process for transferring the data to the cloud. In 2018, it added AWS Snowball Edge devices that can record, label, and compress the data while onboard the vehicle. That saves the time of having to do those tasks after a run and also helps reintegrate new driving models from the cloud.

The data is transferred to and from the cloud via AWS Direct Connect, a dedicated, secure, high-speed data connection, at much faster rates than before.

Like Lyft Level 5, TuSimple uses Spot Instances and was able to reduce its simulation and analysis times from weeks to hours, which is in part how it was able to go from a single test truck in 2018 to more than 50 hauling real freight by mid-2020. The company still has its own data centers, Wallace says, “but we grew so rapidly that we needed a partner to scale what we were doing.”

Thanks to real world and simulated testing, Wallace says, TuSimple is already demonstrating to clients the capabilities of the trucks.

The company’s control algorithm can keep a truck at the center of its lane with 4 cm accuracy at 65 mph with a loaded trailer. It has also delivered 10 percent better fuel economy than human drivers, thanks to more precise throttle and directional control.

Those savings are extremely important to customers like UPS, McLane, and the United States Postal Service, who are using the system right now. “When COVID-19 hit, we did not stop operations because we had contracts to fulfill and our model allowed for social distancing.”

Storing and complying

The vast amounts of data TuSimple collects are also retained in the cloud and indexed for future use, Wallace says, a key part of regulatory compliance. “We have our own storage capabilities, but it’s not suitable for the scale that’s required,” he adds. “You have to have confidence that the data is stable, secure, and accessible for as long as you need it to be.”

“When you create a machine learning model that makes decisions like how a car is operated,” AWS’ John Barrus says, “you need to be able to make the data that went into it available for audit at a later date.” Opinions on how long that data needs to be kept vary, but the general rule of thumb is that it needs to be retained for 10-15 years, and it needs to be easily available.

AWS makes it possible to store all of a company’s data in one central location – a “data lake.” The “data lake” reference architecture was specifically developed by AWS after working with numerous customers on the challenges they faced managing their autonomous and ADAS data.

In a manner similar to Spot Instances, it makes various levels of storage available for using and retaining data in a tiered pricing system. Fresh data is usually stored in Amazon S3. Dormant data can be moved to Amazon Glacier, which is very low cost but has slower access. “It’s still more cost effective and accessible than a tape solution,” says AWS’ Nisarg Modi.
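As a rough illustration of tiered storage, the move from S3 to Glacier is commonly expressed as a lifecycle rule on a bucket. The sketch below only builds the rule as a Python dictionary in the shape Amazon S3 uses for lifecycle configurations. The helper name `tiering_rule`, the `drive-logs/` prefix, and the 90-day threshold are assumptions made for the example, not details from TuSimple or AWS.

```python
def tiering_rule(rule_id, prefix, days_until_archive):
    """Build one S3 lifecycle rule that archives objects to Glacier.

    Objects under `prefix` stay in standard S3 storage while fresh, then
    transition to the GLACIER storage class after `days_until_archive` days.
    """
    if days_until_archive < 0:
        raise ValueError("days_until_archive must be non-negative")
    return {
        "ID": rule_id,
        "Filter": {"Prefix": prefix},
        "Status": "Enabled",
        "Transitions": [
            {"Days": days_until_archive, "StorageClass": "GLACIER"},
        ],
    }


# Hypothetical policy: archive recorded drive logs once they are 90 days old.
lifecycle = {"Rules": [tiering_rule("archive-drive-logs", "drive-logs/", 90)]}

# With AWS credentials and a real bucket, a dictionary like this could be
# passed as the LifecycleConfiguration argument of the boto3 S3 client's
# put_bucket_lifecycle_configuration call. That step is left out here so
# the sketch runs without an AWS account.
```

Keeping the policy as data means the archive threshold can be reviewed and adjusted alongside retention requirements without touching the pipelines that write the data.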

The road to Level 5 might be long, Modi says, but AV developers have opportunities to fine-tune their operations and build value on the way. “The best play is a long term strategy. How do you scale your operations? How do you manage your cash in what may be tough times going forward?” The cloud can help, he adds, and so can a partner with cloud infrastructure.

To learn more about how AWS is helping companies accelerate development of autonomous driving systems, please download our AV ebook.