Amazon AWS Official Blog

Use Cloud Foundations Product Factory to plan, design and one-click deploy infrastructural cloud resources such as multi-account access control and permission policies

The Chinese version [5] of this blog post was originally published on March 14, 2024. We updated the product definitions based on the latest specifications when translating and republishing it in English.

The design and implementation of the Cloud Foundations solution revolve around the 30 capabilities in 6 categories of the Amazon Web Services (AWS) Cloud Foundations proof of concept. The capabilities we previously focused on mainly concerned cloud infrastructure aspects such as security, infrastructure, governance, and risk management, with the aim of building a multi-account cloud operating environment, centered on the basic landing zone, that complies with best practices and security standards. Once that environment is established, building the workload isolation capability is one of the most important follow-up tasks and an essential part of the sustainable development of an enterprise’s journey to the cloud. This capability is designed to create and manage isolated environments for workloads to reduce the impact of vulnerabilities and threats. One of its abilities is to provide pre-approved, deployable architectures. The core of this ability is to provide common and repeatable mechanisms that maintain controls and boundaries around workloads through a library of pre-approved patterns. Simply put, this capability must support “define, approve, and deploy” for architectures that serve workloads.

Architectures can be large or small, such as an elastic bastion host architecture where instances are managed by an autoscaling group [1]. In fact, breaking out of the traditional monolithic stereotype of “architecture”, we can regard any meaningful combination of resources as a reasonable architecture. Back to the topic of cloud environment construction: after the basic landing zone is built, at least two further construction items are generally necessary, namely cloud network connectivity and user access control. The biggest difference between building the basic landing zone and building these two items lies in the degree of standardization versus customization. The basic landing zone is largely the same for every customer, whereas network connectivity and permissions management can be completely different.

The cloud network connectivity architecture is mainly built through Cloud Foundations’ VPC-sharing and TGW-sharing network connectivity [2], network traffic inspection [3], and multi-stage pipelines [4], which can meet most complex network architecture requirements. Access control, meanwhile, mainly involves the design and deployment of Amazon IAM users, roles, and policies within multiple member accounts, and of resources such as Amazon IAM Identity Center (Identity Center) permission sets and directory groups in the Security Account. Another important aspect of access control is the integration of an external identity provider with the identity center, which is not covered in this article.

The access and permissions resources that each customer needs to create and manage vary widely. Therefore, Cloud Foundations recently added a “Product Factory” module to automate the creation and management of customized cloud resources. With it, we can concretely view an architecture as a “product”: define products in JSON format, generate pipelines through pre-built Amazon Service Catalog products, then release and approve the pipelines to deploy the product to specified accounts and regions. For example, you can design and define a set of access control standards for a project and establish pre-approved directory groups, permission sets, related customer policies, roles, and so on. You can also use this as a template to establish similar access control resources for other projects, meeting the requirements of the principle of least privilege and fine-grained management.

Product factory design ideas

The design idea of Product Factory can be summed up as “infrastructure-as-definition” (IaD), a further simplification of the currently popular infrastructure-as-code (IaC). IaC requires writing, compiling, and executing code to manage cloud resources. Common IaC options include Terraform, as well as high-level programming languages used with the AWS Cloud Development Kit (CDK) or Pulumi. IaD, on the other hand, does not require you to master a programming language; you only need to know JSON syntax. There are also the following key differences between the two:

| Item | IaC | IaD |
| --- | --- | --- |
| Language | Terraform, other high-level programming languages based on CDK or Pulumi | JSON |
| Code storage | Self-prepared, such as Amazon CodeCommit, GitHub, GitLab etc. | Amazon AppConfig (can be modified to use GitLab, etc.) |
| Compile and runtime environment | Self-prepared instance hardware and software environments matching the chosen programming language | Cloud Foundations reads and runs the product definitions via Amazon CodeBuild |
| CICD | Self-prepared, such as Amazon CodePipeline, GitHub Actions, GitLab CI/CD etc. | Cloud Foundations manages CICD processes via Amazon CodePipeline |
| Product catalog management | Self-prepared | Cloud Foundations manages and operates the product catalog via Amazon Service Catalog |

Judging from the comparison table above, Cloud Foundations’ IaD simplifies the automated creation and standardized management of cloud resources in several ways: code storage, the runtime environment, CICD, and product catalog management are all handled for you. As the name suggests, as long as the product is properly “defined”, Cloud Foundations takes care of everything else. This way, you can focus on the product or architecture design itself, spend less effort on tedious low-level matters, and improve both work efficiency and design quality.

Product definition basic structure

Let us decompose products using a “divide and conquer” strategy. A product or architecture, no matter how complex, can be broken down into a group of small units, each being a set of resources that belong to one service and are deployable to one account in one region. Such a service-account-region set of resources is the smallest unit of a product. The granularity of this smallest unit is a service rather than a resource, because a granularity that is too fine greatly reduces the efficiency of product definition and deployment. We call the smallest unit of a product a “block”. More complex products can be assembled from different blocks stacked together like building blocks.

The smallest unit of a product definition is a block, from which product architectures large or small can be constructed. A block is a collection of different resources of the same service. There are dependencies among resources. Dependencies among resources of the same service within a block are called intra-block dependencies. For example, IAM roles rely on customer policies, and directory group assignments in the identity center rely on permission sets. Since resources of different services must belong to different blocks, there are also dependencies between blocks. Dependencies among resources of different services in different blocks are called inter-block dependencies. For example, you must have a KMS key before you can encrypt an S3 bucket, and an IAM service role before a Lambda function can assume it. Dependent blocks need to be created sequentially, while blocks without dependencies can be created simultaneously.

Let’s take the product “elastic bastion host architecture with instances managed by an autoscaling group” from the September 2023 blog post [1] as an example. The product mainly contains 4 blocks involving 4 services (IAM, VPC, EC2, autoscaling). Among them, the EC2 block relies on the IAM block’s service role, the autoscaling block relies on the EC2 block’s launch template, and the IAM block and VPC block have no dependencies. Based on this, we have arranged the build order as shown in the figure below.

Figure 1 Phased decomposition of the elastic bastion host architecture with instances managed by an autoscaling group

If you look closely at the figure above, you will notice that it looks like a pipeline. Blocks within a stage can be executed concurrently, and the stages are executed in sequence. This allows for faster execution while satisfying inter-block dependencies. The first stage contains all blocks that do not depend on other blocks. From the second stage onward, each block depends on at least one block in an earlier stage. Let P be a product, S a stage, and b a block. A stage is a set of blocks, S = {b1, b2, … , bm}; a product is an ordered collection of stages, P = {S1, S2, … , Sn}, and its length is |P| = n. Given a set of blocks and dependencies of the form bi < bj (bi must be created before bj), defining a product optimally becomes the problem of minimizing the length of the product.
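
The minimization described above can be sketched as a longest-path layering of the dependency graph: each block goes into the earliest stage that comes after all of its dependencies, so the number of stages equals the longest dependency chain. The sketch below is illustrative only. The function name plan_stages, the dictionary input format, and the block labels are assumptions made for this example and are not part of Product Factory.

```python
from typing import Dict, List, Set


def plan_stages(deps: Dict[str, Set[str]]) -> List[List[str]]:
    """Group blocks into the fewest sequential stages.

    deps maps each block to the set of blocks it depends on.
    A block is placed one stage after its latest dependency.
    """
    level: Dict[str, int] = {}
    visiting: Set[str] = set()

    def resolve(block: str) -> int:
        if block in level:
            return level[block]
        if block in visiting:
            raise ValueError(f"circular dependency at block {block!r}")
        visiting.add(block)
        # Blocks without dependencies land in stage index 0.
        level[block] = 1 + max(
            (resolve(d) for d in deps.get(block, set())), default=-1
        )
        visiting.discard(block)
        return level[block]

    for block in deps:
        resolve(block)

    stage_count = max(level.values(), default=-1) + 1
    stages: List[List[str]] = [[] for _ in range(stage_count)]
    for block in sorted(level):
        stages[level[block]].append(block)
    return stages


# Bastion host example from the text (hypothetical block labels).
bastion = {
    "IAM": set(),
    "VPC": set(),
    "EC2": {"IAM"},
    "autoscaling": {"EC2"},
}
print(plan_stages(bastion))
```

For the bastion host example, this groups IAM and VPC into the first stage, EC2 into the second, and autoscaling into the third, matching the build order in the figure.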

Therefore, we can map product blocks and their order to the components of a CodePipeline pipeline: blocks map to “actions”, sets of concurrently executed blocks map to “stages”, and the ordered execution of all blocks of a product maps to a “pipeline”. Abstracting further, a product can be represented as a two-dimensional array whose elements are blocks. The outer array holds the sequential stages, and each inner array holds the concurrent blocks of one stage. For example, the figure above can be expressed as “[[IAM, VPC], [EC2], [autoscaling]]”.

The basic structure of a product definition is a two-dimensional array, and an appropriate product structure can be constructed by arranging blocks according to their dependencies. How simple or complex a product should be is a design decision driven by actual business needs. According to the pipeline quota, a single stage cannot contain more than 50 blocks, and a single product cannot contain more than 50 stages, so the total number of blocks is limited to 2,500.
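
As a rough illustration of these structural rules, the following sketch checks that a parsed definition is an array of stages, that each stage is an array of blocks, and that the two limits quoted above are respected. The function name check_product_shape is hypothetical, and two of its rules are assumptions rather than documented Product Factory behavior: that every block is an object with a "service" key (as in the examples in this article) and that empty stages are rejected.

```python
from typing import Any, List

MAX_BLOCKS_PER_STAGE = 50    # limit quoted in the text
MAX_STAGES_PER_PRODUCT = 50  # limit quoted in the text


def check_product_shape(product: Any) -> List[str]:
    """Return a list of problems; an empty list means the shape looks fine."""
    if not isinstance(product, list):
        return ["product must be an array of stages"]
    problems: List[str] = []
    if len(product) > MAX_STAGES_PER_PRODUCT:
        problems.append(
            f"{len(product)} stages exceed the limit of {MAX_STAGES_PER_PRODUCT}"
        )
    for i, stage in enumerate(product, start=1):
        # Assumption: an empty stage is treated as an error.
        if not isinstance(stage, list) or not stage:
            problems.append(f"stage {i} must be a non-empty array of blocks")
            continue
        if len(stage) > MAX_BLOCKS_PER_STAGE:
            problems.append(
                f"stage {i} has {len(stage)} blocks, limit is {MAX_BLOCKS_PER_STAGE}"
            )
        for j, block in enumerate(stage, start=1):
            # Assumption: every block names its service, as in the examples.
            if not isinstance(block, dict) or "service" not in block:
                problems.append(
                    f"stage {i}, block {j} must be an object with a 'service' key"
                )
    return problems


print(check_product_shape([[{"service": "iam"}], [{"service": "sso"}]]))
```

Running such a check before storing a definition would catch shape mistakes early, before a pipeline is generated.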

Product operation and maintenance

With an understanding of the smallest unit and basic structure of a product definition, we now introduce the day-to-day product operation and maintenance process, using the access control and permission resources that landing zone construction often involves as an example. Please refer to the “Cloud Foundations Product Definition Specification” for the specific meanings of the relevant resource properties, which are not detailed here. So far, Product Factory supports the 68 most important and frequently used cloud resources across 38 AWS services.

Define a product

Suppose a general hospital asks for two service roles for its Oncology Department, one for EC2 instances and one for Lambda functions, on top of the basic landing zone created by Cloud Foundations. Both roles authorize reading the related tumor data. The EC2 role additionally has view-only permissions and all Lambda permissions. A permission set for viewing tumor data and a corresponding directory group in the identity center are also required.

[
  [
    {
      "accounts": ["123456789012"],
      "service": "iam",
      "policies": {
        "s3-oncology": {
          "statements": [{
              "actions": ["s3:ListBucket*", "s3:GetBucket*"],
              "resources": ["arn:${PARTITION}:s3:::my-oncology-data-123456789012"]
            }, {
              "actions": ["s3:GetObject*"],
              "resources": ["arn:${PARTITION}:s3:::my-oncology-data-123456789012/*"]
          }]
        }
      },
      "roles": {
        "ec2-oncology": {
          "trusts":  ["ec2"],                 "policies": ["s3-oncology"],
          "aws_policies": ["ViewOnlyAccess"], "services": ["lambda"]
        },
        "lambda-oncology": {"trusts": ["lambda"], "policies": ["s3-oncology"]}
      }
    },
    {
      "accounts": ["$.account.security"],
      "service": "iam",
      "policies": {
        "s3-oncology": {
          "statements": [{
              "actions": ["s3:ListBucket*", "s3:GetBucket*"],
              "resources": ["arn:${PARTITION}:s3:::my-oncology-data-123456789012"]
            }, {
              "actions": ["s3:GetObject*"],
              "resources": ["arn:${PARTITION}:s3:::my-oncology-data-123456789012/*"]
          }]
        }
      } 
    }
  ],
  [
    {
      "service": "sso",
      "permissions": {
        "s3-oncology": {"policies": ["$.s3-oncology"], "aws_policies": ["ViewOnlyAccess"]}
      },
      "groups": {"s3-oncology": {"s3-oncology": ["123456789012"]}}
    }
  ]
]

Based on the above requirements and on Product Factory’s resource specifications for IAM and the identity center, we define the access control product as shown in the code snippet above (you need to change the target account to a valid account). We divided the product into two stages. The first stage defines the customer policy s3-oncology in both the member account and the Security Account, and the two service roles ec2-oncology and lambda-oncology in the member account only. The second stage defines the identity center permission set and directory group, both named s3-oncology. The following figure marks intra-block dependencies with thin dotted lines and inter-block dependencies with thick dashed lines. IAM is not deployed in a particular region, so region information is omitted. By default, the identity center is deployed to the Security Account in the main region, so account and region information is omitted as well. The definition uses the preset environment variable ${PARTITION}. Avoid hard-coded values where possible so that the definition can be reused in multiple environments.
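
Intra-block dependencies such as “roles rely on customer policies” can be checked mechanically before deployment. The following is a minimal sketch, assuming that a role’s "policies" list refers to customer policies defined under "policies" in the same IAM block, as the definition above suggests. The function name undefined_role_policies is hypothetical, and the sample is a trimmed copy of the member-account block with the policy statements left empty.

```python
import json
from typing import Any, Dict, List


def undefined_role_policies(product: List[List[Dict[str, Any]]]) -> List[str]:
    """List role-to-policy references that no policy in the same IAM block defines."""
    missing: List[str] = []
    for stage_no, stage in enumerate(product, start=1):
        for block in stage:
            if block.get("service") != "iam":
                continue
            defined = set(block.get("policies", {}))
            for role_name, role in block.get("roles", {}).items():
                for ref in role.get("policies", []):
                    if ref not in defined:
                        missing.append(
                            f"stage {stage_no}: role {role_name} -> policy {ref}"
                        )
    return missing


# Trimmed copy of the member-account block (statements omitted for brevity).
sample = json.loads("""
[[{
  "accounts": ["123456789012"],
  "service": "iam",
  "policies": {"s3-oncology": {"statements": []}},
  "roles": {
    "ec2-oncology": {"trusts": ["ec2"], "policies": ["s3-oncology"]},
    "lambda-oncology": {"trusts": ["lambda"], "policies": ["s3-oncology"]}
  }
}]]
""")
print(undefined_role_policies(sample))  # an empty list means every reference resolves
```

A mistyped policy name in a role would show up in the returned list with its stage number and role name.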

Figure 2 Schematic representation of intra-block and inter-block dependencies of an access control product

Deploy and destroy a product

Refer to section 11 of the Cloud Foundations User Operation Manual. Perform the following main steps as the product-manager role:

  1. Create a new product application profile called product-ac-oncology;
  2. Enter product-ac-oncology as both the product name and the profile, and keep the default values for the stage and variables;
  3. Launch the product;
  4. Release and approve the pipeline pipeline-product-ac-oncology-apply-fresh;
  5. Assume the member account to confirm that one policy and two roles have been created;
  6. Assume the Security Account to confirm that the permission set and directory group have been created.

Destroying a product follows the reverse order of the deployment process:

  1. Release and approve the pipeline pipeline-product-ac-oncology-destroy;
  2. Assume the member account to confirm the deletion of one policy and two roles;
  3. Assume the Security Account to confirm the deletion of the permission set and directory group;
  4. Terminate the product product-ac-oncology;
  5. Delete the product application profile product-ac-oncology.

Multi-stage product operation and maintenance

In the previous section, we covered the entire process of defining, deploying, and destroying access control resources for the Oncology Department through Product Factory. In practice, many departments may need access control, such as the outpatient department, the inpatient department, and so on. Assuming that each department has similar access control requirements for its own data, a simple approach is to make multiple copies of the above product definition and replace the bucket names and member accounts in each. However, this approach produces a lot of boilerplate and hard-coded values, which hinders batch maintenance later on. A better approach is to use the stage and variables of Product Factory to supply the corresponding values, and to dynamically generate access control resources for multiple departments from a single product definition. If each deployment of the product targets a single member account, we can use the stage as the account number and refer to it via ${STAGE}.

Define a product

We abstract the member account and the department name as the stage and a variable, represented by STAGE and DEPARTMENT respectively. The updated product definition is shown below.

[
  [
    {
      "accounts": ["${STAGE}"],
      "service": "iam",
      "policies": {
        "s3-${DEPARTMENT}": {
          "statements": [{
              "actions": ["s3:ListBucket*", "s3:GetBucket*"],
              "resources": ["arn:${PARTITION}:s3:::my-${DEPARTMENT}-data-${STAGE}"]
            }, {
              "actions": ["s3:GetObject*"],
              "resources": ["arn:${PARTITION}:s3:::my-${DEPARTMENT}-data-${STAGE}/*"]
          }]
        }
      },
      "roles": {
        "ec2-${DEPARTMENT}": {
          "trusts":  ["ec2"],                 "policies": ["s3-${DEPARTMENT}"],
          "aws_policies": ["ViewOnlyAccess"], "services": ["lambda"]
          },
          "lambda-${DEPARTMENT}": {"trusts": ["lambda"], "policies": ["s3-${DEPARTMENT}"]}
      }
    },
    {
      "accounts": ["$.account.security"],
      "service": "iam",
      "policies": {
        "s3-${DEPARTMENT}": {
          "statements": [{
              "actions": ["s3:ListBucket*", "s3:GetBucket*"],
              "resources": ["arn:${PARTITION}:s3:::my-${DEPARTMENT}-data-${STAGE}"]
            }, {
              "actions": ["s3:GetObject*"],
              "resources": ["arn:${PARTITION}:s3:::my-${DEPARTMENT}-data-${STAGE}/*"]
          }]
        }
      } 
    }
  ],
  [
    {
      "service": "sso",
      "permissions": {
        "s3-${DEPARTMENT}": {
          "policies": ["$.s3-${DEPARTMENT}"], "aws_policies": ["ViewOnlyAccess"]
        }
      },
      "groups": {"s3-${DEPARTMENT}": {"s3-${DEPARTMENT}": ["${STAGE}"]}}
    }
  ]
]
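
To preview what a parameterized definition expands to for a given launch, the placeholders can be rendered locally. The sketch below is only an illustration of the idea and does not describe how Product Factory performs substitution internally. The function name render and the trimmed template are hypothetical. Names that are not supplied, such as ${PARTITION}, are deliberately left untouched.

```python
import json
import re
from typing import Any, Dict

PLACEHOLDER = re.compile(r"\$\{([A-Za-z_][A-Za-z0-9_]*)\}")


def render(node: Any, values: Dict[str, str]) -> Any:
    """Recursively replace ${NAME} in keys and string values.

    Names missing from values are kept as-is so a later step can resolve them.
    """
    if isinstance(node, str):
        return PLACEHOLDER.sub(lambda m: values.get(m.group(1), m.group(0)), node)
    if isinstance(node, list):
        return [render(item, values) for item in node]
    if isinstance(node, dict):
        return {render(k, values): render(v, values) for k, v in node.items()}
    return node


# Trimmed, hypothetical excerpt of the IAM block from the definition above.
template = json.loads("""
{"accounts": ["${STAGE}"],
 "service": "iam",
 "policies": {"s3-${DEPARTMENT}": {"statements": [{
   "actions": ["s3:GetObject*"],
   "resources": ["arn:${PARTITION}:s3:::my-${DEPARTMENT}-data-${STAGE}/*"]}]}}}
""")

# One entry per launch: the stage (account number) plus the variables.
launches = [
    ("123456789012", {"DEPARTMENT": "oncology"}),
    ("123456789013", {"DEPARTMENT": "clinic"}),
]
for stage, variables in launches:
    block = render(template, {"STAGE": stage, **variables})
    print(block["accounts"], list(block["policies"]))
```

Rendering on the parsed structure rather than on the raw text keeps the result valid JSON regardless of what the variable values contain.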

Deploy and destroy a product

We define and deploy resources by department, starting with the Oncology Department as the product-manager role. Assume that all of the resources in the previous section relating to the Oncology Department have been destroyed:

  1. Create a new product application profile called product-ac-s3data;
  2. Enter the product name as product-ac-oncology, the profile as product-ac-s3data, the stage as 123456789012, and the variables as {"DEPARTMENT": "oncology"};
  3. Launch the product;
  4. Release and approve the pipeline pipeline-product-ac-s3data-oncology-apply-fresh;
  5. Assume the member account to confirm that one policy and two roles have been created;
  6. Assume the Security Account to confirm that the permission set and directory group have been created.

By repeating the steps above, you can deploy separate access control resources for the outpatient (clinic) and inpatient departments in their corresponding member accounts using the same product definition profile, as shown in the following table. We won’t go into detail here and leave it to readers to practice on their own.

| Department | Product Name | Stage | Variables |
| --- | --- | --- | --- |
| Clinic | product-ac-clinic | 123456789013 | {"DEPARTMENT": "clinic"} |
| Inpatient | product-ac-inpatient | 123456789014 | {"DEPARTMENT": "inpatient"} |

Destroying a product follows the reverse order of the deployment process:

  1. Release and approve the pipeline pipeline-product-ac-s3data-oncology-destroy;
  2. Assume the member account to confirm the deletion of one policy and two roles;
  3. Assume the Security Account to confirm the deletion of the permission set and directory group;
  4. Terminate the product product-ac-oncology;
  5. Repeat the above steps to destroy and terminate all other products based on the product-ac-s3data profile;
  6. Delete the product application profile product-ac-s3data.

Build a blueprint for account creation

As the previous section on multi-stage product operation and maintenance shows, flexible use of the stage and variables lets a single product definition drive multiple deployments. When the stage is specified as an account number, product deployment can be regarded as a resource creation process for that account. We call this type of product definition an account creation blueprint. For example, you can prepare a blueprint product definition blueprint-* for each type of account. Once the Account Factory has created an account, you can standardize the account customization process by launching the corresponding blueprint products with the new account number as the stage, similar to AWS Control Tower’s Account Factory Customization (AFC). The difference is that AFC can only apply one blueprint to an account, while Cloud Foundations can apply multiple blueprint products simultaneously.

Conclusion

This article introduces the design ideas and basic structure of Product Factory, the new module of Cloud Foundations. Using common access control resources in the landing zone as an example, it demonstrates product operation and maintenance in detail, including how to define, deploy, and destroy products. We also discussed how the multi-stage characteristics of a product, with common attributes abstracted through the stage and variables, make large-scale operation and maintenance of access control resources for multiple departments simple and efficient. For details, please refer to the accompanying User Operation Manual and Product Definition Specification.

By extension, you can use Product Factory to build more, and more complex, products and architectures on a wide range of AWS services and resources, so that all product resources in the cloud infrastructure are fully defined, automated, and standardized. A good measure of whether Product Factory is defined soundly and used effectively is how well it serves your business needs and production practices. Cloud Foundations will be continuously updated to support more services and resources, helping you meet various requirements for standardized management and automated operation and maintenance of cloud resources.

References

  1. Blog post: Deploy elastic bastion hosts in one-click for secure session management and port forwarding with Cloud Foundations, 2023-09
  2. Blog post: Use Cloud Foundations to achieve overall planning and one-click deployment of two network sharing models of multi-account organization in cloud environments, 2023-02
  3. Blog post: Use Cloud Foundations to plan and design multi-regional hub-spoke network topology on the cloud and one-click deploy east-west south-north traffic inspection separated or combined, 2023-11
  4. Blog post: Use Cloud Foundations shared network products to plan, design and one-click deploy cross-regional multiple cloud networks on multiple network accounts, 2024-02
  5. Blog post: 借助 Cloud Foundations 产品工厂规划设计并一键部署云上多账户访问控制及权限策略等基础设施资源, 2024-03

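About the authors

Clement Yuan

Consultant with Amazon Web Services Professional Services. He worked for many years at Amazon’s headquarters in Seattle, USA, on the Amazon Relational Database Service (RDS) development team, and has extensive experience in software development and cloud operations. He is currently responsible for architecture consulting, solution design, and project implementation in areas including business continuity and scalable operations, cloud adoption and migration of enterprise applications and databases, cloud disaster recovery management, and the Well-Architected Framework.

刘育新

Senior Consultant on the Amazon Web Services ProServe team, with long-standing experience in designing cloud adoption solutions and implementing projects for enterprise customers.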