Using AWS to Build Tools That Will Design Tomorrow’s Green Infrastructure
Guest post by Connor Philip, DevOps Engineer at Continuum Industries, with Amir Majlesi, Principal Manager at AWS EMEA Prototyping Labs, and Bigad Soleiman, Lead Prototyping Architect
Continuum Industries is an Edinburgh-based startup developing AI tools that enable engineers to rapidly create and explore design options for infrastructure projects. We’re on a mission to help accelerate the path to Net Zero by speeding up the early-stage design of large, complex infrastructure projects. To date, our web application, Optioneer, has helped support the design of major water and offshore wind transmission projects. The projects we work on are built in and influence the real world; they help secure our freshwater supply and help connect offshore wind farms to the mainland.
Optioneer employs multiple AWS services and is updated daily thanks to our continuous deployment pipeline. However, as it grew in complexity, we realized we needed to overhaul our deployment process to deliver more reliable updates to production, faster. We had been aware of some issues but were reluctant to start making changes. Our concern was that any change would become a big resource drain on our developers and drag on past multiple deadlines.
That is when the AWS startup team provided additional support. When we explained our challenges, they suggested working together with AWS Prototyping Labs to build a system to handle canary deployments with our current tech stack. A prototyping engagement with AWS is focused on co-development of an end-to-end solution together with AWS experts, whereby we can rapidly design, build, and test in an agile environment following AWS best practices. We agreed on a timeline of five weeks. While we weren’t sure how much we could achieve in such a short span of time, we were excited about the prospect of working with AWS experts.
Developing the prototype
Our team of three (a DevOps engineer and an AI engineer from Continuum Industries, plus a prototyping architect from AWS) started implementing the canary deployment system by slotting it into our current deployment cadence, while also fitting in the validation of complex configurations that exist only on our production system. It was important to us to take full advantage of the prototyping architect's support to learn best practices when using AWS to build this system.
Daily stand-ups between the CI and AWS team members and weekly reviews of progress helped us make adjustments in real time and share learnings that other team members could utilize in their own tasks.
Once we had determined the core components, which gave us more confidence in our vision, progress quickly picked up. The individual pieces of the puzzle started to come together:
- Our DevOps team member created the deployment workflow.
- Our AWS prototyping architect created the core business logic.
- Our AI engineer put forward the overall Step Functions design.
A prototyping approach helped us push forward quickly, with the AWS prototyping architect guiding us through design choices and demonstrating how various AWS services would work together to meet our unique requirements.
Our AWS solutions architect suggested we use AWS Step Functions, a low-code, visual workflow service, which became the core of our new deployment system. Its built-in ability to orchestrate AWS Lambda functions, Amazon Elastic Kubernetes Service (Amazon EKS) job executions, and various AWS services enabled a custom, flexible integration with our system at every level. Its modular nature makes it easily extendable, with the potential of being tailored per service as needed. During development we were also introduced to AWS Step Functions Workflow Studio, which was a vital visual tool in the process of implementing workflows as well as including them in our Infrastructure as Code (IaC) framework, Terraform. The workflow we designed deploys the latest version of our AI service to our production cluster, where we gradually shift traffic over to the new version. Throughout this process, we also perform a set of custom checks against our services, leveraging Amazon CloudWatch, to ensure it’s safe to proceed. Once 100% of the traffic is routed to the new service, we remove the current stable version and replace it with the updated stable one.
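The gradual traffic shift with health checks can be sketched as a small loop. This is a minimal illustration, not Continuum's actual implementation: the weight schedule, the routing update, and the health check (which in the real system would query CloudWatch) are all hypothetical stand-ins.

```python
def canary_weights(steps):
    """Return an increasing traffic-weight schedule ending at 100%."""
    return [round(100 * (i + 1) / steps) for i in range(steps)]

def shift_traffic(apply_weight, healthy, steps=5):
    """Shift traffic to the new version step by step.

    apply_weight: callback that routes the given % of traffic to the
                  new version (e.g. via weighted EKS/load-balancer routing).
    healthy:      callback returning False if the custom checks
                  (e.g. backed by CloudWatch metrics) fail.
    Returns ("promoted", 100) on success, or ("rollback", weight)
    with the weight at which the checks failed.
    """
    for weight in canary_weights(steps):
        apply_weight(weight)
        if not healthy():
            return ("rollback", weight)
    return ("promoted", 100)
```

In the actual system, each iteration of a loop like this maps naturally onto a Step Functions state: a Lambda task to update routing, a wait state, then a choice state branching on the check result.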
When the new version deployed successfully, we initiated the “config validation process.” This automatically generated a schema for the configuration our AI service expects, and verified that configurations already present in our production database didn’t violate it. If any did, a CloudWatch alarm would fire, the rollback workflow would kick in, and a detailed report with useful links to logs and a dashboard would be sent to our internal dedicated Slack channel to ensure maximum visibility and swift debugging. The integration with Slack was made possible thanks to an extendable notification module built on top of Amazon Simple Notification Service (Amazon SNS), Amazon Simple Queue Service (Amazon SQS), and Lambda.
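At its core, a config validation pass like the one described above checks stored configurations against the generated schema and collects any violations. The sketch below assumes a simple schema of required keys and expected types; the field names are invented for illustration and are not Optioneer's real configuration keys.

```python
# Hypothetical generated schema: required key -> expected type(s).
SCHEMA = {
    "solver": str,
    "max_iterations": int,
    "corridor_width_m": (int, float),
}

def violations(config, schema=SCHEMA):
    """Return a list of problems; an empty list means the config is valid."""
    problems = []
    for key, expected in schema.items():
        if key not in config:
            problems.append(f"missing key: {key}")
        elif not isinstance(config[key], expected):
            problems.append(f"bad type for {key}: {type(config[key]).__name__}")
    return problems
```

A non-empty result for any production configuration is what would trip the alarm and trigger the rollback workflow.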
These workflows enable our engineers to perform safer deployments and migrations without impacting the production stack, and to act immediately, with relevant data at hand, when needed.
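The "relevant data at hand" arrives as the Slack report assembled by the notification module. A payload builder for such a report might look like the following; the message structure, field names, and URLs here are hypothetical, not the actual module's format.

```python
def build_rollback_report(service, version, reason, logs_url, dashboard_url):
    """Format a Slack-style message payload with links for swift debugging."""
    return {
        "text": f":rotating_light: Rollback triggered for {service} {version}",
        "attachments": [
            {"title": "Reason", "text": reason},
            {"title": "Logs", "title_link": logs_url},
            {"title": "Dashboard", "title_link": dashboard_url},
        ],
    }
```

In the described architecture, a Lambda consuming from the SQS queue would build a payload like this and post it to the Slack channel's webhook.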
Together with the AWS prototyping team, we successfully co-developed a working prototype within the five-week goal. This partnership provided a scalable architecture our team could build upon, allowing us to iterate faster and have greater control for running different versions of the algorithms for different users. The serverless architecture is based on Step Functions, orchestrating several small, reusable components to provide full flexibility for bespoke deployment requirements while also allowing future extensions.
The AWS Prototyping Team didn’t just help us build the architecture, they transferred their knowledge in a way that empowered us to continue developing on our own. At the end of the five weeks, we achieved our goals:
- Our new deployment system is capable of detecting issues within the service and production configurations during deployment and rolling back when required.
- We shared the knowledge gained and learned a lot from the insights our AWS prototyping architect gave us.
- We went further and managed to add a robust monitoring and alarm system during the prototyping period.
This success wouldn’t have been possible without the team having a prototyping mindset from the outset and without the great help from our AWS prototyping architect. Adopting a prototyping mindset encouraged us to:
- Communicate often and ask lots of questions.
- Adapt our original plan when circumstances change.
- Have an agile mindset to rapidly design, build, test, and learn from failures.
- Always think about the end goal when prioritizing tasks.
- Step out of our comfort zones.
If you find yourself stuck in a situation where you know a change is required, but you’re not quite sure of the correct solution, I recommend getting the AWS prototyping team together to focus on the task.
Connor Philip: Connor is a DevOps Engineer at Continuum Industries. He develops the systems that allow the entire technical team to create tools which will design tomorrow’s green linear infrastructure.
Amir Majlesi: Amir is a Principal Manager at AWS EMEA Prototyping Labs. He supports customers with exploration, ideation, engineering and development of state-of-the-art solutions using emerging technologies such as IoT, Analytics, AI/ML & Serverless.
Bigad Soleiman: Bigad is a Lead Prototyping Architect with an extensive software engineering background at Amazon Web Services. He leads customers through tough business challenges involving serverless, DevOps, and other cutting-edge technologies.