Insights for CTOs: Part 1 – Building and Operating Cloud Applications

In my role as a Senior Solutions Architect, I have spoken to chief technology officers (CTOs) and executive leadership of large enterprises like big banks, software as a service (SaaS) businesses, mid-sized enterprises, and startups. In this series, I share insights gained from various CTOs and engineering leaders during their cloud adoption journeys at their respective organizations. I have taken these lessons and summarized architecture best practices to help you build and operate applications successfully in the cloud. This series will also cover topics on cloud financial management, security, modern data and artificial intelligence (AI), cloud operating models, and strategies for cloud migration.

Optimize cost and performance with AWS services

Your technology costs vs. return on investment (ROI) is likely being constantly evaluated. In the cloud, you “pay as you go.” This means your technology spend is an operational cost rather than capital expenditure, as discussed in Part 3: Cloud economics – OPEX vs. CAPEX.

So, how do you maximize the ROI of your cloud spend? The following sections provide hosting options and to help you choose a hosting model that best suits your needs.

Evaluating your hosting options

EC2 instances and the lift and shift strategy

Using cloud native dynamic provisioning/de-provisioning of Amazon Elastic Compute Cloud (Amazon EC2) instances will help you meet business needs more accurately and optimize compute costs. EC2 instances allow you to use the “lift and shift” migration strategy for your applications. This helps you avoid overhead costs you may incur from upfront capacity planning.

Figure 1. Comparing on-premises vs. cloud infrastructure provisioning

Containerized hosting (with EC2 hosts)

Engineering teams already skilled in containerized hosting have saved additional costs by using Amazon Elastic Kubernetes Service (Amazon EKS) or Amazon Elastic Container Service (Amazon ECS). This is because your unit of deployment is a container instead of an entire instance, and Amazon EKS or Amazon ECS can pack multiple containers into a single instance. Application change management is also less risky because you can leverage Amazon EKS or Amazon ECS built-in orchestration to manage non-disruptive deployments.

Serverless architecture

Use AWS Lambda and AWS Fargate to scale to match unpredictable usage. We have seen AWS SaaS customers build better measures of “cost per user” of an application into their metering systems using serverless. This is because instead of paying for server uptime, you only pay for runtime usage (down to millisecond increments for Lambda) when you run your application.

Further considerations for choosing the right hosting platform

The following table provides considerations for implementing the most cost-effective model for some use cases you may encounter when building your architecture:

Building a cloud operating model and managing risk

Building an effective people, governance, and platform capability is summarized in the following sections and discussed in detail in Part 5: Organizing teams to enable effective build/run/manage.

People

If your team only builds applications on virtual machines, asking them to move to the cloud serverless model without sufficiently training them could go poorly. We suggest starting small. Select a handful of applications that have lower risk yet meaningful business value and allow your team to build their cloud “muscles.”

Governance

If your teams don’t have the “muscle memory” to make cloud architecture decisions, build a Cloud Center of Excellence (CCOE) to enforce a consistent approach to building in the cloud. Without this team, managing cost, security, and reliability will be harder. Ask the CCOE team to regularly review the application architecture suitability (cost, performance, resiliency) against changing business conditions. This will help you incrementally evolve architecture as appropriate.

Platform

In a typical on-premises environment, changes are deployed “in-place.” This requires a slow and “involve everyone” approach. Deploying in the cloud replaces the in-place approach with blue/green deployments, as shown in Figure 2.

With this strategy, new application versions can be deployed on new machines (green) running side by side with the old machines (blue). Once the new version is validated, switch traffic to the new (green) machines and they become production. This model reduces risk and increases velocity of change.

Figure 2. AWS blue/green deployment model

Securing your application and infrastructure

Security controls in the cloud are defined and enforced in software, which brings risks and opportunities. If not managed with a robust change management process, software-defined firewall misconfiguration can create unexpected threat vectors.

To avoid this, use cloud native patterns like “infrastructure as code” that express all infrastructure provisioning and configuration as declarative templates (JSON or YAML files). Then apply the same “Git pull request” process to infrastructure change management as you do for your applications to enforce strong governance. Use tools like AWS CloudFormation or AWS Cloud Development Kit (AWS CDK) to implement infrastructure templates into your cloud environment.

Apply a layered security model (“defense in depth”) to your application stack, as shown in Figure 3, to prevent against distributed denial of service (DDoS) and application layer attacks. Part 2: Protecting AWS account, data, and applications provides a detailed discussion on security.

Figure 3. Defense in depth

Data stores

How many is too many?

In on-premises environments, it is typically difficult to provision a separate database per microservice. As a result, the application or microservice isolation stops at the compute layer, and the database becomes the key shared dependency that slows down change.

The cloud provides API instantiable, fully managed databases like Amazon Relational Database Service (Amazon RDS) (SQL), Amazon DynamoDB (NoSQL), and others. This allows you to isolate your application end to end and create a more resilient architecture. For example, in a cell-based architecture where users are placed into self-contained, isolated application stack “cells,” the “blast radius” of an impact, such as application downtime or user experience degradation, is limited to each cell.

Database engines

Relational databases are typically the default starting point for many organizations. While relational databases offer speed and flexibility to bootstrap a new application, they bring complexity when you need to horizontally scale.

Your application needs will determine whether you use a relational or non-relational database. In the cloud, API instantiated, fully managed databases give you options to closely match your application’s use case. For example, in-memory databases like Amazon ElastiCache reduce latency for website content and key-value databases like DynamoDB provide a horizontally scalable backend for building an ecommerce shopping cart.

Summary

We acknowledge that CTO responsibilities can differ among organizations; however, this blog discusses common key considerations when building and operating an application in the cloud.

Choosing the right application hosting platform depends on your application’s use case and can impact the operational cost of your application in the cloud. Consider the people, governance, and platform aspects carefully because they will influence the success or failure of your cloud adoption. Use lower risk application deployment patterns in the cloud. Managed data stores in the cloud open your choice for data stores beyond relational. In the next post of this series, Part 2: Protecting AWS account, data, and applications, we will explore best practices and principles to apply when thinking about security in the cloud.

AWS Architecture Blog