AWS for Industries

Tehama leverages Graviton cost efficiency to strengthen business core competency

Introduction

AWS designed Graviton processors aiming to deliver the best price-performance for cloud computing. Since it launched in 2018, many customers of different sizes (start-ups to enterprises) have adopted Graviton for various workloads. Tehama, the world’s first Carrier for Work, has embraced the cost savings and environmental benefits associated with AWS Graviton for their all-in-one platform that combines desktop as a service (DaaS), security, audit, and networking functionalities.

Tehama adopted Graviton technology at a strategic level to strengthen its core business competency. In about 6 months, Tehama migrated their whole technology stack to Graviton. As a result, Tehama’s solution can run on both Intel and Graviton platforms, benefiting from the latest advances in silicon technologies.

As commented by Carson White, VP of Operations, Tehama: “By adopting Graviton we observe a 30% improvement in price-performance and a 29% reduction to the estimated carbon footprint of the Tehama room workload, without compromise to the security, simplicity, availability and capabilities our global customers have come to expect from the Tehama platform.”

Why Graviton?

In any SaaS application, one of the top priorities is maintaining control of cloud spend. This kind of optimization and continuous improvement is key to the overall success of any SaaS-based business. When Tehama learned that AWS was making Arm-based CPUs available as EC2 instances at a lower price point, the team started an early investigation and proof of concept to explore the possibilities. Graviton support in a number of major Linux distributions, including Amazon Linux 2, paved the way for Tehama to quickly migrate existing Linux-based backend infrastructure to Arm-based instances. Tehama was already using Amazon Linux 2, so they chose to stay with that platform for the migration project. With potential savings averaging 20-40% on AWS Graviton-based instances, this was an easy choice.

The secondary goal was achieving cross-platform capabilities. Tehama was interested in adding cross-platform capabilities to key parts of their stack: the elements that end users interact with behind the scenes. By adding support for Graviton, they can now switch between Intel and Graviton instances for their workloads, taking advantage of availability and lower costs in each AWS Region. They also gain a security advantage, as running the fundamental Tehama services on AWS Graviton processors enhances the security of those services, and they could switch instance types on the fly to avoid possible widespread vulnerabilities in the future.

Migration Journey

First, a definition: a Tehama Room is essentially an AWS VPC that runs the Tehama fundamental services and the desktop instances.
Each Tehama Room nominally consists of:

●      RDS instances (shared between environments)
●      Room Routers (Software Defined Network)
●      Microservices
●      Directories
●      Connectors
●      Jenkins Workers

In production, that adds up to approximately 1000 EC2 instances targeted for migration. Because Tehama Room infrastructure is loosely coupled, they could migrate each major component independently.

To kick things off, Tehama started discussions with AWS early on in 2022 to understand the Graviton capabilities and instance details. As commented by Ken Bantoft, Director of Platform Engineering, Tehama: “Direct collaboration with AWS Graviton Specialists throughout the entire migration gave us advantage of getting best practices for tasks such as compiling various libraries and other components”.

Initial migration – RDS & Room Routers

Tehama started the migration journey with RDS migrations, followed by Tehama Room Routers. The RDS migrations were simple and straightforward: Tehama just changed the instance type and AWS handled the rest.
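As a rough illustration of what changing the instance type can look like in code, the sketch below builds the parameters for the RDS ModifyDBInstance call as exposed by boto3. The instance identifier and the db.m6g.large class are assumptions for illustration only; the source does not name the RDS instance classes Tehama used or how the change was triggered.

```python
def build_modify_request(db_instance_id: str, target_class: str,
                         apply_immediately: bool = False) -> dict:
    """Build keyword arguments for the RDS ModifyDBInstance API."""
    if not target_class.startswith("db."):
        raise ValueError(f"not an RDS instance class: {target_class!r}")
    return {
        "DBInstanceIdentifier": db_instance_id,
        "DBInstanceClass": target_class,
        # False defers the change to the next maintenance window.
        "ApplyImmediately": apply_immediately,
    }


def apply_modify_request(request: dict) -> None:
    """Send the request to RDS (requires boto3 and AWS credentials)."""
    import boto3  # imported lazily so the builder works without boto3

    boto3.client("rds").modify_db_instance(**request)


if __name__ == "__main__":
    # Hypothetical identifier and instance class, for illustration only.
    print(build_modify_request("example-db", "db.m6g.large"))
```

Separating the request builder from the API call keeps the sketch easy to review and dry-run before anything touches a real database.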

The Room Routers were the first instances Tehama tackled where they needed to make changes to their application. These Routers are EC2 instances that handle network packet processing, as well as some Docker containers for metrics and logging.

The biggest challenge in the migration process was changing the way they built Docker images within Jenkins pipelines to handle both architectures. Tehama determined that cross-compiling on Jenkins led to longer build times, so they split the pipelines up, added Graviton Jenkins instances, and built the Graviton Docker images natively.
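A minimal sketch of the native-build idea follows: each build worker detects its own CPU architecture and builds an image tagged for that architecture, so no cross-compilation is needed. The image name, version, and tag scheme are hypothetical, and this is not Tehama's actual pipeline code.

```python
import platform
from typing import List, Optional

# Common machine names mapped to Docker platform strings.
_DOCKER_PLATFORMS = {
    "x86_64": "linux/amd64",
    "amd64": "linux/amd64",
    "aarch64": "linux/arm64",
    "arm64": "linux/arm64",
}


def docker_platform(machine: Optional[str] = None) -> str:
    """Return the Docker platform for a machine name (default: this host)."""
    name = (machine or platform.machine()).lower()
    if name not in _DOCKER_PLATFORMS:
        raise ValueError(f"unsupported architecture: {name!r}")
    return _DOCKER_PLATFORMS[name]


def native_build_command(image: str, version: str,
                         machine: Optional[str] = None) -> List[str]:
    """Build a `docker build` command tagged with the worker's architecture."""
    plat = docker_platform(machine)
    arch = plat.split("/")[1]
    tag = f"{image}:{version}-{arch}"  # hypothetical tag scheme
    return ["docker", "build", "--platform", plat, "-t", tag, "."]


if __name__ == "__main__":
    # Hypothetical image name and version.
    print(" ".join(native_build_command("room-metrics", "1.0")))
```

Run on a Graviton worker, the command targets linux/arm64; run on an x86 worker, it targets linux/amd64.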

Tehama placed each component behind a feature flag, so that after deployment they could convert customers one or more at a time. Once the Tehama Room Router changes were ready in production, they rolled them out to all Tehama customers over the course of a week, with no outages.
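The per-customer rollout can be pictured with a small feature-flag sketch like the one below. The class name, room identifiers, and instance sizes are all hypothetical; the source only states that each component sat behind a feature flag and that customers were converted one or more at a time.

```python
from dataclasses import dataclass, field
from typing import Set


@dataclass
class GravitonRollout:
    """Tracks which rooms have the (hypothetical) Graviton flag enabled."""

    x86_type: str
    graviton_type: str
    enabled_rooms: Set[str] = field(default_factory=set)

    def enable(self, *room_ids: str) -> None:
        """Turn the flag on for one or more rooms."""
        self.enabled_rooms.update(room_ids)

    def disable(self, room_id: str) -> None:
        """Roll a single room back to x86."""
        self.enabled_rooms.discard(room_id)

    def instance_type_for(self, room_id: str) -> str:
        """Pick the instance type a room's component should launch on."""
        if room_id in self.enabled_rooms:
            return self.graviton_type
        return self.x86_type


if __name__ == "__main__":
    # Hypothetical room IDs and instance sizes.
    rollout = GravitonRollout(x86_type="t3.medium", graviton_type="t4g.medium")
    rollout.enable("room-a", "room-b")
    for room in ("room-a", "room-b", "room-c"):
        print(room, rollout.instance_type_for(room))
```

Keeping the x86 type as the default means a room that has not been converted, or one that was rolled back, continues to run unchanged.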

Router – crypto performance testing

Tehama took performance measurements on the Room Routers before and after the migration. The routers are the most hardware-dependent of the instances they use, as routers encrypt all the network traffic to Tehama customers’ environments.

Testing was done on unloaded EC2 instances, with a t3.medium as the x86 instance and a t4g.micro as the Graviton instance, using OpenSSL’s benchmark with hardware acceleration both on and off. The chart below shows the difference in performance:

While the x86 EC2 instances had higher numbers in single-threaded tests, once Tehama enabled both CPUs the Graviton performance was significantly better, even on a smaller instance size. Based on these numbers, Tehama felt comfortable with the level of performance available and proceeded with the migration and production deployment of Graviton on the Tehama Room Routers.
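For readers who want to run a comparable test, the sketch below assembles an `openssl speed` invocation for single-threaded and multi-process runs. The cipher (aes-256-gcm) is an assumption, since the source does not say which algorithms were measured, and the sketch does not model the hardware-acceleration on/off switch.

```python
import subprocess
from typing import List


def openssl_speed_command(cipher: str, processes: int = 1) -> List[str]:
    """Build an `openssl speed` command line for the given cipher."""
    if processes < 1:
        raise ValueError("processes must be at least 1")
    cmd = ["openssl", "speed"]
    if processes > 1:
        # -multi runs the benchmark in several processes in parallel.
        cmd += ["-multi", str(processes)]
    # -evp selects the cipher through OpenSSL's EVP interface.
    cmd += ["-evp", cipher]
    return cmd


def run_benchmark(cipher: str, processes: int = 1) -> str:
    """Run the benchmark and return its raw text output."""
    result = subprocess.run(
        openssl_speed_command(cipher, processes),
        capture_output=True, text=True, check=True,
    )
    return result.stdout


if __name__ == "__main__":
    # Assumed cipher; one run per process count, as in a 1 vs 2 CPU test.
    for n in (1, 2):
        print(" ".join(openssl_speed_command("aes-256-gcm", n)))
```

Running the same commands on an x86 instance and a Graviton instance gives directly comparable throughput tables for each block size.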

Directory Servers – the easy part

Next up, Tehama migrated the Directory Servers, so all new Tehama Rooms now use Graviton instances. This process was straightforward, as all custom code was written in interpreted languages (Python, TypeScript) and they had sorted out the Docker build issues by this point. During this stage, they further optimised their build process, parallelizing the Docker container builds so that both platforms are built simultaneously.
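Building both platforms simultaneously can be sketched with a thread pool that starts one build per platform and waits for both to finish. The `build_one` callable is a hypothetical stand-in for whatever triggers a build on the matching Jenkins worker; the source does not describe the actual mechanism.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, Iterable


def build_all_platforms(
    build_one: Callable[[str], str],
    platforms: Iterable[str] = ("linux/amd64", "linux/arm64"),
) -> Dict[str, str]:
    """Run build_one for every platform in parallel and collect results."""
    targets = list(platforms)
    # One thread per platform; each thread mostly waits on an external build.
    with ThreadPoolExecutor(max_workers=max(1, len(targets))) as pool:
        results = list(pool.map(build_one, targets))
    return dict(zip(targets, results))


if __name__ == "__main__":
    def fake_build(platform_name: str) -> str:
        # Hypothetical builder: returns the tag it would have produced.
        return "example-image:" + platform_name.split("/")[1]

    print(build_all_platforms(fake_build))
```

If any build raises an exception, `pool.map` re-raises it when the results are collected, so a failure on either architecture fails the whole step.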

Microservices – dealing with many dependencies

Tehama then tackled the microservices instances. This involved the most work to date, as Tehama first had to compile about 1% of the open-source components for Arm. This took several weeks, as they took the opportunity to improve code hygiene, updating to the latest major versions and incorporating updates to the Docker containers. A key function handled by the Tehama microservices is video processing, so they spent considerable time testing to ensure Tehama would not encounter any functionality or performance issues. Once testing was completed, they deployed the microservices Graviton capability and then completed the migration across all regions within a week.

Connectors – 3rd party module complexities

Finally, Tehama is looking at the Connectors. Connector migration to Graviton is blocked because the connectors contain a 3rd party module that is only available in binary x86 format. While they wait for the vendor to add support for Arm-based CPUs, Tehama is looking throughout the rest of the environment for additional components that may be candidates for migration in the future. They are working with the ISV, through AWS teams, to support Graviton; once this is achieved, Tehama will migrate another 200+ EC2 instances to Graviton.

Migration ROI and Cost-Saving Outcome

To date, Tehama has spent approximately 12 person-months of effort on the migration, including an initial POC and then the work by both development and DevOps teams to migrate the applications. The effort was smaller than expected, primarily because Tehama discovered that, among the 604 libraries on which they have dependencies, only 5 required code changes. These changes revolved around the build process (e.g., Makefiles, CFLAGS) for a few specialized media libraries that had not yet been ported to the arm64 architecture.

The final changesets totaled about 20,000 LOC. While that seems high, 99% of it was build automation code: adjustments to support building both architectures in parallel and updates to the Dockerfiles for the various containers. Tehama will reduce this further with some improvements to the Docker image creation workflows in the near future.
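To put these figures in proportion, the short calculation below works out the share of libraries that needed changes and splits the changeset between build automation and everything else. The inputs are the approximate numbers quoted above, so the results are estimates only.

```python
# Figures quoted in the text (the LOC total and the 99% share are approximate).
TOTAL_LIBRARIES = 604
LIBRARIES_CHANGED = 5
CHANGESET_LOC = 20_000
BUILD_AUTOMATION_PERCENT = 99


def percent(part: int, whole: int) -> float:
    """Return part as a percentage of whole."""
    return 100.0 * part / whole


# Share of dependencies that required code changes.
changed_share = percent(LIBRARIES_CHANGED, TOTAL_LIBRARIES)

# Integer arithmetic keeps the line counts exact.
build_loc = CHANGESET_LOC * BUILD_AUTOMATION_PERCENT // 100
other_loc = CHANGESET_LOC - build_loc

if __name__ == "__main__":
    print(f"Libraries needing changes: {changed_share:.2f}%")
    print(f"Build automation LOC: {build_loc}")
    print(f"Other LOC: {other_loc}")
```

In other words, under 1% of the dependencies needed changes, and only about 200 of the roughly 20,000 changed lines fell outside build automation.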

According to Tehama’s internal calculation, the migration achieved a 30% improvement in price-performance. Considering that this was achieved with 12 person-months of effort, the return on investment (ROI) is very high. The key to Tehama’s success is the scalability of its SaaS software stack. Once fully migrated, the whole Tehama solution could be scaled to thousands of instances.

Carbon footprint reduction

Most Tehama customers, such as enterprises and government organizations, have environmental (ESG) goals. For those with legacy on-premises VDI solutions, Tehama helps them achieve their carbon emissions reduction goals faster by first retiring the legacy solutions and migrating to Tehama’s AWS-based solution. Studies by 451 Research show that AWS typically produces an 88% lower carbon footprint, and is 3.6 times more energy efficient, than the median surveyed US enterprise datacentre. This reduction will be up to 96% once AWS is powered by 100% renewable energy, a target AWS is on a path to meet by 2025. Second, Tehama helps reduce carbon emissions through its migration to energy-efficient Graviton processors for its compute workloads. Graviton3-based EC2 instances use up to 60% less energy for the same performance than non-Graviton EC2 instances.

Tehama successfully migrated 100% of the eligible compute workloads to AWS Graviton-based EC2 instances within 3 months. The main services in Tehama Room, like directories, routers and microservices, originally operated on x86-based C5, M5, and T3 instances. These are now in production on Graviton-based C6g, M6g and T4g instances across 11 AWS Regions. Because AWS Graviton-based EC2 instances are more energy efficient than comparable non-Graviton instances, Tehama achieved significant carbon emissions savings for the workload migrated to Graviton. Tehama decreased the carbon footprint of the workload by an estimated 29% due to the migration to Graviton.
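The family pairs named above (C5 to C6g, M5 to M6g, T3 to T4g) can be captured in a small lookup, sketched below. Keeping the instance size unchanged is an assumption made for illustration; not every size exists in every family, and the sizes Tehama actually chose are not stated here.

```python
# x86 families and their Graviton counterparts, as named in the text.
X86_TO_GRAVITON_FAMILY = {
    "c5": "c6g",
    "m5": "m6g",
    "t3": "t4g",
}


def graviton_equivalent(instance_type: str) -> str:
    """Map an x86 instance type such as 'm5.large' to its Graviton family.

    The size suffix is kept as-is (an assumption; verify that the size
    exists in the target family before using the result).
    """
    family, sep, size = instance_type.partition(".")
    if not sep or not size:
        raise ValueError(f"expected '<family>.<size>', got {instance_type!r}")
    if family not in X86_TO_GRAVITON_FAMILY:
        raise ValueError(f"no Graviton mapping for family {family!r}")
    return f"{X86_TO_GRAVITON_FAMILY[family]}.{size}"


if __name__ == "__main__":
    for itype in ("c5.xlarge", "m5.large", "t3.medium"):
        print(itype, "->", graviton_equivalent(itype))
```

A lookup like this makes it easy to audit which instance types in an inventory already have a defined Graviton target and which still need a decision.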

Summary

AWS Graviton Arm-based processors delivered a major leap in performance and cost efficiency. Graviton2 saves cost while helping customers work toward their carbon emissions reduction goals. Tehama and AWS will collaborate for continuous improvements by leveraging Graviton3 in the near future for further performance gain, cost savings, and carbon footprint reduction.

References:

Amazon Sustainability: Sustainability in the Cloud:
https://sustainability.aboutamazon.com/environment/the-cloud

AWS Graviton:
https://aws.amazon.com/ec2/graviton/

The Carbon Reduction Opportunity of Moving to Amazon Web Services:
https://sustainability.aboutamazon.com/carbon-reduction-aws.pdf

Cedric Hu

Cedric Hu is a Solutions Architect at AWS Canada. He helps clients in the manufacturing industry achieve Industry 4.0 by leveraging IoT and data analytics technologies. Before AWS, Cedric spent many years in the telecom and wireless networking domain.

Ahmed Elhosary

Ahmed Elhosary is a Technical Account Manager (TAM) with AWS and a member of the Canada East Enterprise Support team. He is also a member of the technical field community for containers and has an insatiable appetite for exploring ways to modernize customer workloads.

Ken Bantoft

Ken has been moving data around the world for more than 20 years – including physical clouds via satellites and aircraft, and virtually between cloud computing providers and telecommunications service providers. At Tehama he plays a key role in securely connecting customers to data, and ensuring the stability of Tehama’s underlying infrastructure.