AWS for Industries

Xcel Energy migrates CI/CD to AWS Fargate for 60x faster deployments at 82x lower cost

Digitization is transforming the energy industry. However, at Xcel Energy, a major US electricity and natural gas company, legacy technology infrastructure and a reliance on cloud-service contractors were limiting the benefits of digital development. Jared Keith, DevOps lead, joined Xcel Energy in 2019 to help launch new solutions in Amazon Web Services (AWS) and later transferred to platform engineering to streamline infrastructure, reduce costs, and increase efficiency throughout the company’s cloud environment.

Xcel Energy is a leader in the clean energy transition to a more sustainable future. In December 2018, Xcel Energy was the first major US energy provider to commit to delivering 100 percent carbon-free electricity by 2050. Critical to this transition are the digital solutions its developers create for all types of energy production (wind, hydro, solar, nuclear, coal, and natural gas), as well as drone fleets and field crews. Digital solutions require efficient continuous integration and continuous delivery (CI/CD) pipelines that allow source code to be built, packaged, tested, validated, and verified quickly and easily. When Keith joined the platform engineering team, he found that his team was maintaining a sprawling and cumbersome CI/CD infrastructure that couldn’t scale.

Xcel Energy teams were using the open-source tools Spinnaker and Jenkins to control complex Kubernetes deployments on persistent servers in Amazon Elastic Compute Cloud (Amazon EC2), a broad and deep compute platform. The teams required knowledge of custom languages like HAL and Apache Groovy, which even advanced DevOps engineers struggled to learn. The small community of online and contract HAL and Groovy developers was too expensive and too hard to book for Keith’s needs. His CI/CD pipelines ran only sporadically, at certain hours, a usage pattern that Keith recognized as a good fit for serverless computing.

Serverless technologies allow you to pay only for what you use and greatly reduce the overhead of scaling, patching, securing, and managing servers. To simplify Xcel Energy’s CI/CD infrastructure, Keith turned to AWS Fargate, serverless compute for containers. Using AWS Fargate, developers can run containers in Amazon Elastic Container Service (Amazon ECS), a fully managed container orchestration service, and Amazon Elastic Kubernetes Service (Amazon EKS), a managed Kubernetes service to run Kubernetes in the AWS Cloud, with automatic scaling and built-in high availability. “I was paying a lot of money for long-lived Amazon EC2 instances that were sitting idle waiting to accept jobs,” Keith said.
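The idle-capacity problem Keith describes can be illustrated with a minimal sketch that compares an always-on fleet with pay-per-use billing. Every input here (the hourly rate, the instance count, the busy hours) is a hypothetical placeholder chosen for illustration; none of them are Xcel Energy's figures or AWS pricing.

```python
HOURS_PER_MONTH = 730  # average number of hours in a month


def always_on_cost(instances: int, hourly_rate: float) -> float:
    """Monthly cost of instances that are billed whether or not they run jobs."""
    return instances * hourly_rate * HOURS_PER_MONTH


def pay_per_use_cost(busy_hours: float, hourly_rate: float) -> float:
    """Monthly cost when compute is billed only while jobs are running."""
    return busy_hours * hourly_rate


if __name__ == "__main__":
    # Hypothetical inputs, not measured values.
    idle_fleet = always_on_cost(instances=4, hourly_rate=0.50)
    on_demand = pay_per_use_cost(busy_hours=40, hourly_rate=0.50)
    print(f"always on:   ${idle_fleet:,.2f} per month")
    print(f"pay per use: ${on_demand:,.2f} per month")
```

The gap between the two results grows as the share of idle hours grows, which is why sporadic pipelines are a natural candidate for this billing model.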

The results exceeded Keith’s expectations. “After migrating to AWS Fargate, we saw our jobs run 6,000 percent faster and our monthly infrastructure bill drop by 8,200 percent. I was astonished that our monthly server bill went from several thousand dollars to less than $100. What we had measured in weeks all of a sudden became days or even minutes. I cannot overstate how transformational this has been.”
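For readers comparing these figures with the headline, here is a small sketch of the arithmetic, assuming the quoted percentages correspond to the headline factors of 60x and 82x. A quantity that becomes N times smaller shrinks by (1 - 1/N) x 100 percent, so the reduction always stays below 100 percent. The function name is illustrative only.

```python
def percent_reduction(factor: float) -> float:
    """Percent by which a quantity shrinks when it becomes `factor` times smaller."""
    if factor <= 0:
        raise ValueError("factor must be positive")
    return (1 - 1 / factor) * 100


if __name__ == "__main__":
    # Headline factors from the title: 60x faster, 82x lower cost.
    print(f"60x faster -> job time falls by {percent_reduction(60):.2f} percent")
    print(f"82x cheaper -> bill falls by {percent_reduction(82):.2f} percent")
```

Under that assumption, an 82x lower bill is a reduction of roughly 98.8 percent, which is consistent with a bill falling from several thousand dollars to less than $100.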

Beyond speed and cost: Serverless containers strengthen security and facilitate effortless scaling

Cost and speed weren’t the only gains. “When I first built the infrastructure, I was supplying one team of about 30 people with a pipeline. Now, I’m serving 700 engineers and about 40 teams,” Keith said. Even with the larger workload, Keith’s turnaround time is dramatically faster. Setting up the automation took effort, but a software build that previously took 15 minutes to deploy takes just 15 seconds today.

Everything was rebuilt in GitLab using declarative languages (like YAML and Terraform), which are easier to learn, use, and template. Moving everyone from Jenkins to GitLab, where everything runs as a Docker image, eliminated several previously necessary steps. Images can be prebuilt and deemed secure without having to build out dependencies. A developer downstream simply pulls a prebuilt image into their code and it runs.

Scaling up is also faster and simpler. Prior to using AWS Fargate, if Keith wanted to add pods or nodes, his team would spend about a half day editing Terraform code and testing changes. “Using AWS Fargate,” Keith reports, “our worker runners request to run jobs, our cluster asks for more pods, and AWS Fargate provisions the pods into their specific namespaces. If the pod can’t fit into existing cluster nodes, AWS Fargate pulls a new node, and within 2 minutes a node is attached, and our pod is running on it.”
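The placement behavior Keith describes (use an existing node if the pod fits, otherwise bring up a new one) can be sketched as a toy first-fit scheduler. This is a simplified model for illustration, not how AWS Fargate or Kubernetes is implemented; the `Node` class, the capacity units, and the job names are all hypothetical, and namespaces are left out.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    capacity: int  # hypothetical capacity units per node
    pods: list = field(default_factory=list)

    def free(self) -> int:
        """Capacity left after the pods already placed on this node."""
        return self.capacity - sum(size for _, size in self.pods)


def schedule(pods, nodes, node_capacity):
    """Place each (name, size) pod on the first node with room; add a node if none fits."""
    for name, size in pods:
        if size > node_capacity:
            raise ValueError(f"pod {name} is larger than a node")
        target = next((n for n in nodes if n.free() >= size), None)
        if target is None:
            # No existing node has room, so a new node joins the cluster.
            target = Node(capacity=node_capacity)
            nodes.append(target)
        target.pods.append((name, size))
    return nodes


if __name__ == "__main__":
    cluster = schedule([("job-a", 2), ("job-b", 2), ("job-c", 3)], [], node_capacity=4)
    for i, node in enumerate(cluster, start=1):
        print(f"node {i}: {node.pods}")
```

In this run the first two jobs share one node and the third triggers a second node, with no one editing infrastructure code in between.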

Such dramatic speed and scale improvements often raise security concerns, but Xcel Energy noted several security improvements after migrating to AWS Fargate. For instance, when a job is called through a branch update, the temporary pod is provisioned in AWS, it runs its tasks, and then it terminates when finished. The pod is gone forever; Keith’s team is no longer monitoring logs on idle, persistent infrastructure to protect against attack.
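The provision, run, terminate lifecycle can be sketched with a Python context manager. This is an illustrative model only: `active_pods` stands in for the cluster's view of running pods, and `ephemeral_pod` and `run_job` are hypothetical names, not part of any AWS or GitLab API.

```python
from contextlib import contextmanager

active_pods = set()  # stand-in for the pods currently running in the cluster


@contextmanager
def ephemeral_pod(name):
    """Provision a pod for one job and remove it when the job ends."""
    active_pods.add(name)  # provision
    try:
        yield name
    finally:
        # Runs whether the job succeeded or raised, so nothing lingers.
        active_pods.discard(name)


def run_job(name, task):
    """Run `task` inside a short-lived pod and return its result."""
    with ephemeral_pod(name) as pod:
        return task(pod)


if __name__ == "__main__":
    print(run_job("build-123", lambda pod: f"built on {pod}"))
    print(f"pods left running: {len(active_pods)}")
```

Because teardown sits in the `finally` clause, a failed job leaves no pod behind either, which mirrors the point that there is no idle, persistent infrastructure left to watch.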

AWS Fargate also added a layer of identity and access management (IAM) security. Previously, all authentications had to be done in pipeline scripts, which allowed security engineers and administrators to potentially view credentials that they didn’t need access to. Now, when a developer runs a command, they don’t have to manage authentication. Integration of AWS Fargate with IAM roles allows AWS to handle secure machine-to-machine connectivity so the scripts can connect directly. This also speeds up jobs, because the scripts no longer contain authentication steps.
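As a rough illustration of scripts that carry no authentication steps, the sketch below uses boto3, the AWS SDK for Python, which resolves credentials from its environment when none are passed in code. It assumes boto3 is installed and that the job runs with an IAM role attached; the helper names are hypothetical, and this is not taken from Xcel Energy's pipelines.

```python
def default_sts_client():
    """Create an STS client without passing any keys or tokens in code."""
    import boto3  # third-party AWS SDK for Python, imported only when needed

    # No access key or secret appears here; boto3 looks up credentials
    # from the environment the job is running in.
    return boto3.client("sts")


def caller_arn(sts_client) -> str:
    """Return the ARN of the identity that the ambient credentials resolve to."""
    return sts_client.get_caller_identity()["Arn"]
```

In a pipeline job, calling `caller_arn(default_sts_client())` would report which role the job is running as, and the script itself never sees or stores a secret.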

“Altogether,” Keith summarizes, “with these improvements to authentication, persistence, scaling, supportability, and cost, we have created a smoother, more secure developer experience.”

Transforming disaster recovery, streamlining operations, and accelerating change

By using AWS to streamline its CI/CD pipeline, Xcel Energy is more confident in its disaster recovery capabilities. If a cluster fails, the serverless environment enables Keith’s team to spin up all 12 GitLab runners in just 23 seconds using parallel initiations and horizontal scaling. Being able to recover his entire runner fleet in under 1 minute is transformational for Keith, who would have spent 1–3 weeks rebuilding his previous environment.
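The benefit of parallel initiation can be sketched in a few lines: when all runners start at once, total recovery time tracks the slowest runner rather than the sum of all of them. The `start_runner` function below is a hypothetical stand-in that only sleeps to simulate startup; it does not call GitLab or AWS.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def start_runner(index: int, startup_seconds: float = 0.05) -> str:
    """Hypothetical stand-in for starting one runner; sleeps to simulate startup."""
    time.sleep(startup_seconds)
    return f"runner-{index}"


def start_fleet(count: int = 12) -> list:
    """Start every runner concurrently and return their names in order."""
    with ThreadPoolExecutor(max_workers=count) as pool:
        return list(pool.map(start_runner, range(1, count + 1)))


if __name__ == "__main__":
    began = time.perf_counter()
    fleet = start_fleet(12)
    elapsed = time.perf_counter() - began
    print(f"started {len(fleet)} runners in {elapsed:.2f} s")
```

Started one after another, the same 12 simulated runners would take about 12 times the single-runner startup time, which is the difference the parallel approach removes.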

Keith is now maintaining CI/CD infrastructure for all of Xcel Energy, and he is enthusiastic about the “sheer ease of use” of AWS Fargate. Supporting 40 teams wouldn’t have been possible in his previous environment. “That’s only the tip of this iceberg,” he said. “There is so much more that I can do now. It is awesome to see the possibilities open up.”

These innovations have made Xcel Energy’s infrastructure more reliable while also dramatically reducing its cloud spend, savings that the company can pass directly to its customers. Its development environment continues to get faster and better, and Keith gets to focus on more challenging work to facilitate change. “I’m a platform engineer, so my day should be spent platforming things. By adopting AWS Fargate, I can focus on innovation. The growth of digital solutions at Xcel Energy is reducing overhead costs, increasing agility, improving security, and helping us deliver cleaner energy to our 3.7 million electricity and 2.1 million natural gas customers.”

Xcel Energy set an aggressive interim target to reduce carbon emissions by more than 80 percent by 2030. As Xcel Energy accelerates toward its goal of becoming an overall net-zero energy company by 2050 across the eight states that it serves, AWS is helping by powering operations with 100 percent renewable energy by 2025 as part of Amazon’s overall commitment to achieve net-zero carbon by 2040.

Learn more

See how Vanguard uses Amazon ECS and AWS Fargate to increase investor value.
Or watch a video highlighting Taco Bell’s success with AWS serverless technology.

To learn more about how AWS is helping transform the energy industry and optimize businesses, visit the AWS for Energy page.

Jared Keith

Jared is a Lead DevOps Engineer at Xcel Energy where he has led IT infrastructure and automation teams for the past three years. He loves embracing new technologies that lead to massive shifts in the way people work and communicate. Jared has fifteen years of software experience in roles bridging QA, build/release, tooling, SRE, DevOps, and platform engineering. He enjoys working, teaching, and empowering others to use Kubernetes. In his spare time he likes to make salsa; the spicier the better.

Tom Lauducci

Tom is a Solutions Architect at AWS where he provides technical guidance to support Xcel Energy's cloud transformation. He is passionate about sustainability, renewable energy, and efficiency, and loves helping AWS customers innovate in those domains. Tom's roles in 10 years at Amazon have varied from designing the workstations and robotics you see on an Amazon Fulfillment Center tour to designing physical security automation tools for AWS's data center fleet. He holds a BS in mechanical engineering and maintains 6 active AWS certifications.