AWS for Industries

Volkswagen Group’s three success factors for a 6-month cloud migration

Based in Wolfsburg, Germany, the Volkswagen Group is comprised of 10 brands from five European countries: Volkswagen, Volkswagen Commercial Vehicles, ŠKODA, SEAT, CUPRA, Audi, Lamborghini, Bentley, Porsche, and Ducati. By implementing its “NEW AUTO—Mobility for Generations to Come” strategy, the Volkswagen Group will drive its transformation and accelerate its realignment from being a vehicle manufacturer to becoming a major, global software-driven mobility provider—a company that is redefining mobility while doing business in a conscientious, climate-neutral way. The Volkswagen Group’s guiding principle is this: the development of sustainable, connected, safe, and tailored mobility solutions for future generations.

Migrating applications to the cloud is a top priority for many application modernization initiatives at Volkswagen Group. The company’s motivation is to reduce costs, improve agility, and increase scale and availability while heightening the level of security. However, hosting applications in a cloud environment is not always sufficient to unleash their full potential. Often, companies must adapt applications and change the corresponding development processes. This was true for the Volkswagen Group’s integration solution, OEM.IL.

Striving to provide a good integration experience, OEM.IL provides backend connectivity for the Volkswagen Automotive Cloud. Serving over four million vehicles, the Volkswagen Automotive Cloud offers services such as online remote updates for the Volkswagen ID vehicle family. OEM.IL is part of a wider undertaking by the Volkswagen Group: the Group Integration Platform (GrIP), which fosters the Volkswagen Group’s transformation as a data-driven company.

This blog post looks at OEM.IL’s 6-month migration to Amazon Web Services (AWS) and explores the migration’s three success factors. These factors are part of the AWS Well-Architected Framework, which provides architectural best practices for designing and operating reliable, secure, efficient, and cost-effective systems in the cloud.

Embracing dual-track agility

For many software projects, teams are concerned about meeting deadlines and delivering the promised cost savings. From the start, the OEM.IL migration team faced many risks and obstacles, so it searched for ways to reduce the risk of delivering later than anticipated. The team first decided to work in small, incremental steps. They did not want to conduct a “big-bang” migration, a strategy in which all microservices are migrated and put into production at once. With a big-bang approach, a migration team faces challenges that typically surface only in production environments, at the latest and most critical point in the migration project.

This initial decision turned out to be beneficial for the project because it supported an agile way of working and learning. Looking back at the project, the OEM.IL migration team realized that they could have started reducing risks on the project even earlier. In searching for ways to produce more precise estimates, the team learned about the concept of dual-track agility.

Using this method, a development team performs two parallel tasks in each sprint, as shown in Figure 1. The first task delivers the sprint’s planned results. The second task discovers the knowledge missing to successfully deliver on the goal for the next sprint. By using this method, OEM.IL’s migration team avoided risks and increased software quality throughout the migration lifecycle, while alleviating their initial concerns.

By performing discovery continuously, the migration team was able to quickly find and resolve new issues and test and validate features, thereby gaining more confidence in the cloud services it was using and providing better value to the migration project.

Figure 1: Agility at OEM.IL

Empowering teams

In companies with a more traditional working style, development teams are used to the following way of working:

In the sprint planning meeting, they receive a number of stories that they will be working on in the upcoming weeks. They have not seen the stories before, and they have not estimated the amount of work needed to complete them. The most notable point here is that the developers don’t know the greater context or the business goals that the set of stories contributes to.

The people who created this set of stories often come from heterogeneous backgrounds. Some of the stories come from the product owner. Others might come from other stakeholders, like the security department or management representatives.

Usually, all these contributors create these stories to the best of their knowledge, often describing a technical solution. However, the product owner and the other stakeholders often don’t have the in-depth knowledge to find an optimal technical solution in the first place. Working this way, the product team does not benefit from the extensive experience the developers have in creating technical solutions. It also misses the opportunity to find better solutions that might take less time to implement.

Additionally, the surrounding organization often doesn’t have clear business goals defined for its products in development. Sometimes departments have slide decks providing vague information, each about its own current business context, and often these slide decks are not self-explanatory. Working that way, every product owner gains a slightly different view of the business context of the organization. In the worst case, this may lead to confusion and frustration. Two product owners might work on competing goals with their product teams. Product owners are forced to hold alignment meetings with management to help everyone understand the whole picture.

The way to deal with product development is to fall in love with the problem and not the solution. Thus, the stakeholders should describe a business problem rather than deciding on the solution to be implemented. By formalizing the business problem by writing it as a business goal, stakeholders empower developers to discuss possible technical solutions. In that discussion, they can consider the technical state of the product and use all of their experience and creativity.

To write user stories, teams need access to clear, unified business goals. The effort needed for creating a set of business goals usually pays off immediately. Department members can refer to these goals and derive actions in their own areas of responsibility. In this way, the whole department benefits from the expertise of the team members in the same way the product team benefits from the technical knowledge of the developers.

A business goal should be self-contained and convey all the information needed to understand it. Besides a title, it should have a description detailing the business context, the customers it is targeting, a set of success measures, an owner, and a due date.
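The fields listed above can be captured in a small data structure. The following is a minimal sketch in Python, not the format the OEM.IL team used: the class names `BusinessGoal` and `SuccessMeasure`, the `missing_fields` helper, and the owner and due date values are all assumptions made for illustration.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class SuccessMeasure:
    """One measurable criterion that tells the team when the goal is met."""
    name: str
    target: float
    unit: str


@dataclass
class BusinessGoal:
    """A self-contained business goal with the fields named in the text."""
    title: str
    description: str                 # the business context
    target_customers: list[str]
    success_measures: list[SuccessMeasure]
    owner: str
    due_date: date

    def missing_fields(self) -> list[str]:
        """Return the names of text or list fields that are still empty."""
        checked = ("title", "description", "target_customers",
                   "success_measures", "owner")
        return [name for name in checked if not getattr(self, name)]


# Illustrative values only: the owner and due date are invented placeholders.
goal = BusinessGoal(
    title="Increase availability",
    description="OEM.IL supports over-the-air updates and must be reachable at any time.",
    target_customers=["Volkswagen vehicle owners"],
    success_measures=[SuccessMeasure("infrastructure availability", 99.95, "percent")],
    owner="(placeholder owner)",
    due_date=date(2030, 1, 1),
)

incomplete = BusinessGoal("Draft goal", "", [], [], "", date(2030, 1, 1))

print(goal.missing_fields())
print(incomplete.missing_fields())
```

A check like `missing_fields` is one way to keep goals self-contained before user stories are derived from them.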

One of the most important parts of a business goal is its set of success measures. These define what success looks like for the development team. Ideally, a success measure should be automatically measurable. In the OEM.IL migration project, the business goal was to increase availability, so the team chose a success measure of achieving 99.95 percent availability for its infrastructure on AWS.
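To make the 99.95 percent target concrete, it can be translated into a downtime budget. The sketch below is plain arithmetic; the function name and the choice of 30-day and 365-day periods are assumptions for illustration, and the source does not say over which period the team measured availability.

```python
def allowed_downtime_minutes(availability_percent: float, period_minutes: float) -> float:
    """Return the downtime budget implied by an availability target."""
    unavailable_fraction = 1.0 - availability_percent / 100.0
    return unavailable_fraction * period_minutes


MINUTES_PER_DAY = 24 * 60

# Downtime budget for the 99.95 percent target over two example periods.
per_30_days = allowed_downtime_minutes(99.95, 30 * MINUTES_PER_DAY)
per_365_days = allowed_downtime_minutes(99.95, 365 * MINUTES_PER_DAY)

print(f"{per_30_days:.1f} minutes per 30 days")    # 21.6 minutes per 30 days
print(f"{per_365_days:.1f} minutes per 365 days")  # 262.8 minutes per 365 days
```

Under these assumptions, the target leaves roughly 22 minutes of downtime per month, which is why such a measure is best tracked automatically.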

By deriving user stories from these business goals, the product owners can clearly describe the contribution toward the success measure of the respective business goal. With that information openly available, the development team can take part in finding the best solution. This solution usually is a tradeoff between the technical feasibility of the implementation and the business value it creates.

There are two reasons why working with business goals instead of predefined solutions was beneficial for the OEM.IL migration project:

  1. Successful, agile teams take advantage of the passion and knowledge of all team members, giving them the autonomy to make decisions and mistakes and continuously get better.
  2. Successful, agile development teams can make their decisions much faster than teams that are using a traditional working style.

By working this way, the OEM.IL migration team created a culture of buy-in and ownership. They saw a rise in ideas and contributions coming from the development team, which greatly improved the outcome of the project.

Democratizing advanced technologies

OEM.IL plays an important part in a number of critical business processes of the Volkswagen Group, so availability is vital. One of these processes is the over-the-air update of Volkswagen cars. Customers need to be able to download updates for their cars at any time. Therefore, OEM.IL needs to be available 24/7. A lack of availability for OEM.IL would lead to a number of unhappy customers and a considerable loss of reputation for the Volkswagen Group.

The goals of making OEM.IL highly available at scale and building a system that can withstand catastrophic scenarios, such as the loss of a data center, led to the Volkswagen Group’s decision to migrate it to AWS.

When a company is building a highly available system at scale, it is important to address single points of failure. The OEM.IL migration team decided to concentrate on their main task of migrating an integration platform to AWS while relying on the expertise of AWS to provide a highly available cloud offering for each single point of failure. Figure 2 provides a high-level overview of the OEM.IL cloud infrastructure, showing all four components that could each become a single point of failure. These components are the Domain Name System (DNS), the load balancer, the OEM.IL runtime, and the databases.

Figure 2: High-level overview of OEM.IL architecture on AWS

For its DNS, the OEM.IL development team selected Amazon Route 53, a highly available and scalable cloud DNS. One of the drivers for this decision was that the AWS service-level agreement for Route 53 commits to 100 percent availability. This is a benefit for domain names that are fully managed by AWS. Besides the very high availability of Route 53, the team also chose it because it is a managed service, so the development team does not need to devote time to maintaining it in the future.

For the application runtime, the OEM.IL migration team chose Amazon Elastic Kubernetes Service (Amazon EKS), a managed container service for running and scaling applications. Amazon EKS is a managed service, which means that AWS takes care of the underlying infrastructure. However, as part of the AWS shared responsibility model, service updates must be implemented by the customer.

Amazon EKS can be configured to automatically fail over between three Availability Zones, resulting in very high availability for the applications running on it. An Availability Zone can be seen as one or more isolated data centers within an AWS Region. OEM.IL builders can take advantage of this setup by running at least one instance of an application in each Availability Zone.
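The benefit of running one instance per Availability Zone can be estimated with a simple probability model. The sketch below assumes that zones fail independently and that the application is reachable as long as at least one zone is up. The per-zone availability of 99 percent is a hypothetical figure chosen for illustration, not a value published by AWS or reported by the OEM.IL team.

```python
def availability_with_redundancy(per_zone_availability: float, zones: int) -> float:
    """Probability that at least one of several independent zones is up."""
    if not 0.0 <= per_zone_availability <= 1.0:
        raise ValueError("per_zone_availability must be between 0 and 1")
    if zones < 1:
        raise ValueError("zones must be at least 1")
    # All zones are down at once with probability (1 - a) ** zones.
    return 1.0 - (1.0 - per_zone_availability) ** zones


HYPOTHETICAL_ZONE_AVAILABILITY = 0.99  # assumption for illustration only

single_zone = availability_with_redundancy(HYPOTHETICAL_ZONE_AVAILABILITY, 1)
three_zones = availability_with_redundancy(HYPOTHETICAL_ZONE_AVAILABILITY, 3)

print(f"1 zone:  {single_zone:.6f}")
print(f"3 zones: {three_zones:.6f}")
```

In practice, failures are not perfectly independent, so the real gain is smaller than this model suggests. The model still shows why spreading instances across zones removes the runtime as a single point of failure.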

The OEM.IL migration team also chose Application Load Balancer as its load balancing solution. The service-level agreement for Application Load Balancer states an availability of at least 99.99 percent. Using Application Load Balancer, requests can be evenly distributed between the application instances. In the event of an outage of one Availability Zone, Application Load Balancer can detect the issue using health checks and route requests to the healthy application instances running in the other Availability Zones.
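The routing behavior described here can be illustrated with a small simulation. This is a simplified model, not the way Application Load Balancer is implemented: the `Target` class, the zone and instance names, and the plain round-robin rotation are assumptions made for illustration. A real load balancer determines health through periodic checks with configurable thresholds.

```python
from dataclasses import dataclass
from itertools import cycle


@dataclass
class Target:
    """One application instance registered with the load balancer."""
    name: str
    zone: str
    healthy: bool = True


def route(targets: list[Target], request_count: int) -> list[str]:
    """Distribute requests round-robin over the targets that are healthy."""
    healthy = [t for t in targets if t.healthy]
    if not healthy:
        raise RuntimeError("no healthy targets available")
    rotation = cycle(healthy)
    return [next(rotation).name for _ in range(request_count)]


# Hypothetical setup: one instance in each of three zones.
targets = [
    Target("app-a", "zone-a"),
    Target("app-b", "zone-b"),
    Target("app-c", "zone-c"),
]

before_outage = route(targets, 3)

# Simulate a failed health check for every instance in zone-b.
for target in targets:
    if target.zone == "zone-b":
        target.healthy = False

after_outage = route(targets, 4)

print(before_outage)  # ['app-a', 'app-b', 'app-c']
print(after_outage)   # ['app-a', 'app-c', 'app-a', 'app-c']
```

After the simulated outage, no request reaches the instance in the failed zone, and the remaining instances share the load.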

In addition, the OEM.IL migration team researched Amazon Aurora, a MySQL- and PostgreSQL-compatible relational database that features a distributed, fault-tolerant, self-healing storage system that automatically scales up to 128 TB per database instance. It delivers high performance and availability with the option to replicate data across three Availability Zones. This is done by creating a virtual Aurora database instance consisting of two physical Aurora instances in each Availability Zone (see “Fast cross-region disaster recovery and low-latency global reads with Amazon Aurora Global Database”). While the client interacts with the virtual instance, Aurora routes requests to healthy instances and replicates changes to the instances in all Availability Zones.

Because the Volkswagen Group is a large automotive company, it needs assurance that all the AWS services it uses meet its security and compliance needs, and it has to make trade-offs accordingly. To move fast and tie the infrastructure together, the OEM.IL migration team used the Volkswagen Basic Platform Landing Zone foundation. This helped them onboard OEM.IL to AWS and maintain compliance while still having the freedom to use 99 percent of the services offered by AWS. OEM.IL also benefits from a catalog of internal products at VW designed around AWS Direct Connect, which lets customers create a dedicated network connection to AWS. The Volkswagen Basic Platform Landing Zone foundation uses AWS Landing Zone (a solution that helps customers more quickly set up a secure, multi-account AWS environment based on AWS best practices) and provides well-proven and well-architected services as building blocks for a simple, modular, and individual buildup of domain-specific solutions and application environments. It helps teams quickly deliver an integrated, secure, and compliant foundation for projects across domains, brands, and regions at the Volkswagen Group. The foundation makes use of multiple cloud providers to generate high business value and to release project teams from foundational heavy lifting.

Volkswagen Group will maintain high standards in future migrations to AWS

Finding a method for quick migrations to AWS will be the catalyst for the Volkswagen Group to realize its “NEW AUTO—Mobility for Generations to Come” strategy. In this post, we explored the three key success factors of one such Volkswagen Group migration: embracing dual-track agility, empowering teams, and democratizing advanced technologies.

By using these methods, the OEM.IL migration team has shown how fast you can move to the cloud while maintaining high standards. To learn more about how automotive customers are using AWS, take a look at aws.amazon.com/automotive/

Daniel Schleicher

Daniel Schleicher is a Senior Solutions Architect at AWS for Continental, focusing on software-defined cars. In this field he is interested in applying cloud computing principles for automotive applications, and advancing the software development process of automotive applications utilizing virtualized hardware. In previous roles, Daniel led the migration of an enterprise integration platform to AWS at Volkswagen and, as a product manager, contributed to the creation of a central service for the Mercedes Intelligent Cloud.

Sebastian Collins

Sebastian Collins is a Sr. Global Solutions Architect at Amazon Web Services (AWS) for the Volkswagen Group. He has worked in a wide range of technical fields, from start-ups in Sub-Saharan Africa to global enterprises throughout Europe. As an expert in the field of cloud computing, he delivers a strong pace of innovation, technical excellence that is both broad and deep, and customer obsession through well-established thought leadership.