eeMobility and Infrastructure as Code: A Migration Story
Guest post by:
Steve Behrendt, Lead Software Engineer, Netlight Consulting
Rasmus Pfeiffer, IT & Infrastructure Engineer, eeMobility GmbH
Scott Mullaly, DevOps Engineer, eeMobility GmbH
Aria Omidvar, Software Engineer, eeMobility GmbH
eeMobility was founded in 2015 with the mission of pioneering smarter energy. Our product provides a complete solution for charging corporate fleet vehicles using 100% certified renewable energy, from installation of charging stations through to invoicing and contract management. Our platform gives customers the simplicity of charging their electric vehicles at home, at their business, and anywhere in between. We offer a subscription service that provides access to these stations for a fixed monthly fee.
Our backend connects and communicates with a large number of charging stations, operating on a range of different protocols over REST and WebSocket connections. Charging processes are initiated by customers through RFID cards and our Swift iOS/Kotlin Android app (developed in-house), before reaching our system where the entire charging process is managed. The system also produces billable charging data for efficient invoicing to customers. In addition, we host a React app to support internal office tasks such as customer service, station operation/serviceability, and contract management.
All in all, our core system is made up of 19 microservices that together form a highly reliable distributed application. These services are Spring Boot 2.1 applications using Java 11, persisting with PostgreSQL and Hibernate, built with Gradle.
Why Amazon Web Services
Not long ago, we were fully dependent on an external team who decided how and where to run our software. Initially this allowed us to focus on developing our very young software solution and deliver feature after feature. However, as both the platform and the desire to become independent evolved we knew an infrastructure solution tailored to fit our needs was a must.
When the time came, we faced a critical question to which none of us had a definitive answer: which cloud provider suits our needs best? After comparing the leading cloud solution providers, we concluded that AWS was the best fit. It has the largest feature set, is well documented, and is easy to use, particularly for a software team with little to no prior DevOps experience. Finally, it has a data center region inside Germany, which removed our biggest data sovereignty concerns.
Now that it is clear why we picked AWS, let's take a look at which AWS services we are using and what our setup looks like.
AWS Account Setup
We use an organizational setup of four parallel, isolated environments (Live, QA, Development, and Playground) in separate accounts, which serve as member accounts to a master account.
In each of the member accounts, we define administrator, developer, and automation IAM roles, which are assumed by different users accordingly. Users are added to the master account, and what they can do in each member account is controlled entirely through the permissions attached to these roles. The developer role has only the permissions developers need to get their job done, while preventing unintended resource creation and deletion.
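As a rough sketch, such a cross-account developer role could be declared in CloudFormation along the following lines. The master account ID, role name, and exact permission set here are placeholders for illustration, not our actual policy:

```yaml
AWSTemplateFormatVersion: '2010-09-09'
Description: Developer role in a member account (illustrative sketch)

Resources:
  DeveloperRole:
    Type: AWS::IAM::Role
    Properties:
      RoleName: developer
      # Trust users in the master account (111111111111 is a placeholder)
      # to assume this role via STS.
      AssumeRolePolicyDocument:
        Version: '2012-10-17'
        Statement:
          - Effect: Allow
            Principal:
              AWS: arn:aws:iam::111111111111:root
            Action: sts:AssumeRole
      Policies:
        - PolicyName: developer-permissions
          PolicyDocument:
            Version: '2012-10-17'
            Statement:
              # Allow day-to-day read/debug access (example subset).
              - Effect: Allow
                Action:
                  - ecs:Describe*
                  - ecs:List*
                  - logs:GetLogEvents
                  - logs:FilterLogEvents
                Resource: '*'
              # Explicitly deny destructive operations.
              - Effect: Deny
                Action:
                  - rds:DeleteDBInstance
                  - ec2:TerminateInstances
                Resource: '*'
```

The explicit `Deny` statements are one way to guard against unintended deletion even if a broader `Allow` is added later, since deny always wins in IAM policy evaluation.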
Infrastructure as code (IaC)
The entire infrastructure supporting our platform is scripted in CloudFormation across all accounts and environments. Every change to any piece needs to be tested, committed, code reviewed through pull requests, and successfully run through our Bitbucket pipeline. This may sound tedious and slow, but the opposite is true.
Infrastructure is not a monolith. There are cross-cutting components like the VPC for network setup, RDS, autoscaling groups, and load balancers, but also application-specific infrastructure such as ECS services, Route 53 entries, and ALB listener rules.
The infrastructure code is split into common and application-specific parts. The common infrastructure is broken down into auth, hosting, network, queue, secrets, and role stacks; it lives in its own Git repository and has a dedicated pipeline to deploy it. Every service has its own app-specific CloudFormation stack that gets deployed as part of that app's pipeline. We deploy fundamental changes in the common stack before we deploy any app that may rely on that infrastructure.
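The seam between common and app-specific stacks can be wired together with CloudFormation exports and imports. A hypothetical sketch, in which the stack, export, and resource names are illustrative:

```yaml
# --- In the common network stack: export the VPC ID so that
# --- app-specific stacks can consume it without hard-coding.
Outputs:
  VpcId:
    Description: VPC shared by all services
    Value: !Ref Vpc
    Export:
      Name: common-network-VpcId

# --- In an app-specific stack: import the shared value.
Resources:
  ServiceSecurityGroup:
    Type: AWS::EC2::SecurityGroup
    Properties:
      GroupDescription: Security group for one microservice
      VpcId: !ImportValue common-network-VpcId
```

One consequence of this pattern is a deploy-order dependency: CloudFormation refuses to delete or change an export while any stack imports it, which is also why common-stack changes have to go out before the app stacks that rely on them.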
Pro tip: make small changes to the infrastructure with CloudFormation. Big changes take a long time to deploy and are harder to roll back.
In the future, we will extract roles and permissions into their own repository so they can be deployed in isolation. We ran into a few issues where we could not roll back successfully after a failed infrastructure deployment, because the permissions were removed before the infrastructure components that relied on them, leaving the rollback deadlocked.
ECS and AWS infrastructure
After all this talking, let’s dive into what it actually looks like for us. To run each Spring Boot microservice, we create an ECS service and deploy it into our cluster. The app-specific CloudFormation stack uses services such as S3, EBS, EFS, ALB, IAM, Route 53, Aurora RDS, SNS, MQ, Lambda, CloudFront, and CloudWatch.
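The core of such an app-specific stack is an ECS task definition plus an ECS service. A minimal sketch follows; the service name, image URI, ports, sizes, and import names are illustrative assumptions, not our exact templates:

```yaml
Resources:
  # Task definition for one Spring Boot microservice
  # ("billing-service" is a hypothetical name).
  TaskDefinition:
    Type: AWS::ECS::TaskDefinition
    Properties:
      Family: billing-service
      ContainerDefinitions:
        - Name: app
          Image: !Sub '${AWS::AccountId}.dkr.ecr.${AWS::Region}.amazonaws.com/billing-service:latest'
          Memory: 1024
          PortMappings:
            - ContainerPort: 8080   # Spring Boot's default HTTP port

  # ECS service that keeps the desired number of tasks running
  # in the shared cluster (imported from the common hosting stack).
  Service:
    Type: AWS::ECS::Service
    Properties:
      Cluster: !ImportValue common-hosting-ClusterName
      TaskDefinition: !Ref TaskDefinition
      DesiredCount: 2               # spread across Availability Zones
      LoadBalancers:
        - ContainerName: app
          ContainerPort: 8080
          TargetGroupArn: !Ref TargetGroup   # target group defined elsewhere in the stack
```

ECS then takes care of placing the tasks on cluster instances and replacing them if they become unhealthy.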
All the EC2 cluster instances that run our applications sit inside a private subnet. We run in one region across multiple Availability Zones so that all apps fulfill high-availability requirements. Internal service communication goes through a private ALB and AWS service discovery. External communication happens through the NAT and Internet Gateways in the public subnet.
Alongside our Spring Boot apps we run four further supporting services, in the form of Elasticsearch and our monitoring and aggregated-logging infrastructure. These are also deployed into the cluster, meaning our entire range of services is highly available and managed by ECS.
Our Biggest Challenge
Some of our services need to be accessed both internally and externally. For internal load balancing we use an ALB, with routes and DNS entries for each service, that is not accessible from the internet. For external load balancing we use a second ALB.
With this setup we encountered a problem: at the time of implementation, CloudFormation only allowed a single ECS service to be attached to one load balancer target group. However, since the AWS SDK exposes this functionality, we could write a Lambda function to do exactly that. We created a second, empty target group, and the Lambda function, triggered when the service started or stopped, registered the instance with that second target group, thereby allowing both internal and external communication for a single service.
Interestingly, this functionality has since been introduced and is now supported natively by CloudFormation. The Lambda function is no longer needed, and we have migrated to using native CloudFormation support for multiple target groups.
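With native support, attaching one ECS service to both the internal and external target groups is simply a second entry in the service's `LoadBalancers` list. A sketch, with illustrative resource names:

```yaml
  # One ECS service registered with two target groups, replacing
  # the earlier Lambda-based workaround (resource names are examples).
  Service:
    Type: AWS::ECS::Service
    Properties:
      Cluster: !Ref Cluster
      TaskDefinition: !Ref TaskDefinition
      LoadBalancers:
        # Same container and port, registered twice:
        - ContainerName: app
          ContainerPort: 8080
          TargetGroupArn: !Ref InternalTargetGroup   # behind the private ALB
        - ContainerName: app
          ContainerPort: 8080
          TargetGroupArn: !Ref ExternalTargetGroup   # behind the public ALB
```

ECS keeps both target groups in sync as tasks start and stop, so no custom registration logic is required anymore.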
We did a fair amount of trial and error to arrive at the implementation we wanted, and now that we are happy and confident with our infrastructure, we plan to optimize it. Cost engineering through scheduling and autoscaling are two key aspects we will focus on, and we have no doubt that the many services AWS provides make that possible. We are excited to continually adopt more AWS services to support our rapidly growing platform.
Further note: This blog post is part of the collaboration of Netlight Munich and eeMobility GmbH in the emerging field of electric charging. Together, we build a system to enable corporate fleets to use renewable energy and pioneer in the field of smart charging.