Automating Multiple Environments with Terraform

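How to manage the shared infrastructure of the development (dev), test, and production (prod) accounts from a main account.

Published: March 27, 2023
Tags: Terraform, CodeCatalyst, CI-CD, Infrastructure as Code, Github-Actions, DevOps, Tutorials, AWS
Olawale Olaleye
AWS experience: 300 - Advanced
Time to complete: 60 minutes
Cost to complete: Free tier eligible
Prerequisites:

  • An AWS account (sign up / log in)
  • A CodeCatalyst account
  • Terraform 1.3.7+
  • (Optional) A GitHub account

Last updated: March 7, 2023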

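As teams grow, managing and coordinating changes to the application environment becomes more complex. A small team can provision all of its infrastructure and deploy all of its systems in a single account, but there comes a point where too many people are making changes at the same time to keep track of them all. In addition, when all infrastructure lives in one account, it is very hard to follow the principle of least privilege (people and services should hold only the minimum permissions needed for their intended task), let alone enforce naming conventions for resources.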

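This tutorial guides you through splitting your infrastructure across multiple accounts. You will create a main account that manages all the common infrastructure shared by every environment (for example users, build pipelines, and build artifacts), plus one environment account for each of the three stages of your application: dev, test, and prod. Splitting into separate environments supports a growing team and, because each environment is isolated in its own AWS account, gives you a clear separation between them. The approach uses Terraform, Amazon CodeCatalyst, and Identity and Access Management (IAM) roles that are assumed across accounts. It is adapted from the author's 2020 HashiTalks session. This tutorial covers: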

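  • How to split your infrastructure across multiple accounts and repositories
  • How to use Terraform to set up a build and deployment pipeline that manages all changes in the environment accounts
  • How to keep the infrastructure in dev, test, and prod in sync without manually copying files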

Setting up a CI/CD pipeline for the main account

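The first step is to set up a CI/CD pipeline for all the shared infrastructure in the main account. The Terraform bootstrapping tutorial covers this in detail, so this section lists the steps without repeating the explanations; refer to that tutorial if you want the details. If you have completed the Terraform bootstrapping tutorial, skip its cleanup step and jump straight to the next section to continue with this tutorial. The steps are summarized below.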

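Before setting up the pipeline, make sure you are logged in to both your AWS account and your CodeCatalyst account. Then set up the project, environment, repository, and CI/CD pipeline. In CodeCatalyst: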

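  1. Create a new project called TerraformMultiAccount using the "Start from scratch" option.
  2. In that project, create a repository called main-infra, with the .gitignore file set to Terraform.
  3. Under CI/CD > Environments, create an environment called MainAccount and link an AWS account to the project.
  4. Under CI/CD > Environments, create a Dev Environment that opens in Cloud9, cloning the main-infra repository from the main branch.
  5. Launch the Dev Environment and run aws configure to set up the AWS CLI with the credentials of an IAM user in your AWS account.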

Bootstrapping Terraform

First, bootstrap Terraform in your account. Start by installing Terraform: run the following in the terminal of your Cloud9 Dev Environment:

# Install specific Terraform version
TF_VERSION=1.3.7
wget -O terraform.zip https://releases.hashicorp.com/terraform/${TF_VERSION}/terraform_${TF_VERSION}_linux_amd64.zip
unzip terraform.zip
rm terraform.zip
sudo mv terraform /usr/bin/terraform
sudo chmod +x /usr/bin/terraform

Run the following commands in the main-infra directory:

# Set up required resources for Terraform to use to bootstrap
mkdir -p _bootstrap
wget -P _bootstrap/ https://raw.githubusercontent.com/build-on-aws/bootstrapping-terraform-automation/main/_bootstrap/codecatalyst/main_branch_iam_role.tf
wget -P _bootstrap/ https://raw.githubusercontent.com/build-on-aws/bootstrapping-terraform-automation/main/_bootstrap/codecatalyst/pr_branch_iam_role.tf
wget -P _bootstrap/ https://raw.githubusercontent.com/build-on-aws/bootstrapping-terraform-automation/main/_bootstrap/codecatalyst/providers.tf
wget -P _bootstrap/ https://raw.githubusercontent.com/build-on-aws/bootstrapping-terraform-automation/main/_bootstrap/codecatalyst/state_file_resources.tf
wget -P _bootstrap/ https://raw.githubusercontent.com/build-on-aws/bootstrapping-terraform-automation/main/_bootstrap/codecatalyst/variables.tf

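Next, create the required infrastructure: an Amazon S3 bucket as the backend that stores the state file, an Amazon DynamoDB table that manages locks so that only one change is applied at a time, and the two IAM roles used by the workflows. Edit _bootstrap/variables.tf and change the value of state_file_bucket_name to a unique name for your S3 bucket. This tutorial uses tf-multi-account; choose your own unique name. If you want to use a different AWS Region, change the value of the aws_region variable as well. Then initialize Terraform and apply the changes to create the state file bucket, lock table, and IAM roles for the CI/CD pipeline: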

terraform init
terraform apply

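Terraform now tracks our resources in a state file, but that file is stored locally in the Dev Environment, so we need to move it to the S3 backend. First add the S3 backend configuration, then migrate the state file to the bucket. Run the following command to add the backend configuration: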

wget -P _bootstrap/ https://raw.githubusercontent.com/build-on-aws/bootstrapping-terraform-automation/main/_bootstrap/codecatalyst/terraform.tf

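Edit _bootstrap/terraform.tf and change the bucket value to the name of your bucket. Terraform variables cannot be used in a backend configuration, so this has to be done by hand. If you changed the Region used in the provider block, update the region value in terraform.tf as well. Then migrate the state file to the S3 bucket: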

terraform init -migrate-state

Next, allow the two new IAM roles to access your CodeCatalyst space and projects. In your space, go to the AWS accounts tab, click your AWS account number, and then click Manage roles from the AWS Management Console. On the page that opens in a new tab, select Add an existing role you have created in IAM, choose Main-Branch-Infrastructure from the dropdown, and click Add role. A green banner at the top of the next page confirms: Successfully added IAM role Main-Branch-Infrastructure. Click Add IAM role and repeat the process for the PR-Branch-Infrastructure role. When you are done, close this window and return to CodeCatalyst.

Lastly, create the workflows. They run terraform plan for every pull request (PR), and terraform apply for every PR that is merged into the main branch. Run the following commands:

mkdir -p .codecatalyst/workflows

wget -P .codecatalyst/workflows https://raw.githubusercontent.com/build-on-aws/manage-multiple-environemnts-with-terraform/main/.codecatalyst/workflows/main_branch.yml
wget -P .codecatalyst/workflows https://raw.githubusercontent.com/build-on-aws/manage-multiple-environemnts-with-terraform/main/.codecatalyst/workflows/pr_branch.yml

wget https://raw.githubusercontent.com/build-on-aws/bootstrapping-terraform-automation/main/providers.tf
wget https://raw.githubusercontent.com/build-on-aws/bootstrapping-terraform-automation/main/terraform.tf
wget https://raw.githubusercontent.com/build-on-aws/bootstrapping-terraform-automation/main/variables.tf

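Edit .codecatalyst/workflows/main_branch.yml and .codecatalyst/workflows/pr_branch.yml, replacing 111122223333 with your AWS account ID. Edit terraform.tf and replace the bucket value with the name of the S3 bucket that stores your state file. Change region if needed; it must match the region in the _bootstrap/terraform.tf backend configuration, because both refer to the same bucket. To create your infrastructure in a different Region, change the aws_region variable in variables.tf. Finally, make sure the Terraform code is formatted correctly and commit all the changes: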

terraform fmt

git add .
git commit -m "Setting main infra ci-cd workflows"
git push

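Go to CI/CD -> Workflows and confirm that the MainBranch workflow for the main branch runs and completes successfully. We now have automated infrastructure management for a single AWS account. Next, we set up the additional environment accounts.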

Creating AWS accounts for the environments

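We need three new AWS accounts, one each for the dev, test, and prod environments. The process is similar to bootstrapping the base infrastructure automation, and we will manage it from the main-infra repository. We already have a bucket for state files, so we will reuse it but change the key in the backend block used for the environment accounts, so that the existing state is not overwritten. Because we are working across AWS accounts, the workflows need a way to access the new accounts, and we will use IAM roles for that. The main account assumes a role in an environment account and can then perform the actions that role permits; who is allowed to assume the role is defined in the role's trust policy. In the main account, we then attach additional policies to the existing workflow IAM roles so that they can assume the matching roles in each environment account. This means the pull request (PR) branch role can only assume the PR branch role in each environment account, which prevents accidental infrastructure changes, and likewise the main branch role can only assume the main branch role in each environment account. The following diagram illustrates the flow: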

We need to create and modify the following resources:

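  1. New AWS accounts: create three new accounts with AWS Organizations.
  2. New IAM roles: create the roles in the environment accounts that the workflows assume, one for the main branch and one for any pull request (PR), each with a trust policy that allows access from the main account.
  3. New IAM policies: define what the workflow IAM roles may do in the environment accounts. The main branch role gets full administrator access to create infrastructure; the PR branch role gets ReadOnly access to validate changes.
  4. IAM policies: one additional policy each for the main and PR branch roles, allowing them to assume the matching role in the environment accounts.
  5. IAM roles: attach these assume-role policies to the existing workflow IAM roles in the main account.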

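To do this, add the following to the variables.tf file in the root of main-infra. It defines the three email addresses used to create the child accounts and the name of the IAM role created for accessing each account. This role is created in every environment account with administrator permissions and can be assumed from the top-level account. We will set up additional IAM roles for the workflows in the environment accounts later.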

Tip: you can use + in an email address to create additional unique addresses that all deliver to the same inbox. For example, if your email is john@example.com, you can use john+aws-dev@example.com. This is useful because every AWS account needs a globally unique email address, but managing multiple inboxes can be a hassle.

variable "iam_account_role_name" {
  type    = string
  default = "Org-Admin"
}

variable "account_emails" {
  type = map(any)
  default = {
    dev : "tf-demo+dev@example.com",
    test : "tf-demo+test@example.com",
    prod : "tf-demo+prod@example.com",
  }
}

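Create a new file called aws_environment_accounts.tf to create the accounts with AWS Organizations and place them in an organizational unit (OU). Grouping accounts in an OU is useful when you need to apply specific rules to several AWS accounts at once. Add the following to the file: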

# Set up the organization
resource "aws_organizations_organization" "org" {
 aws_service_access_principals = [
 "cloudtrail.amazonaws.com",
 "config.amazonaws.com",
 "sso.amazonaws.com",
 ]

 feature_set = "ALL"
}

# Create a new OU for environment accounts
resource "aws_organizations_organizational_unit" "environments" {
 name = "environments"
 parent_id = aws_organizations_organization.org.roots[0].id
}

# Create a new AWS account called "dev"
resource "aws_organizations_account" "dev" {
 name = "dev"
 email = lookup(var.account_emails, "dev")
 role_name = var.iam_account_role_name
 parent_id = aws_organizations_organizational_unit.environments.id

 depends_on = [aws_organizations_organization.org]
}

# Create a new AWS account called "test"
resource "aws_organizations_account" "test" {
 name = "test"
 email = lookup(var.account_emails, "test")
 role_name = var.iam_account_role_name
 parent_id = aws_organizations_organizational_unit.environments.id

 depends_on = [aws_organizations_organization.org]
}

# Create a new AWS account called "prod"
resource "aws_organizations_account" "prod" {
 name = "prod"
 email = lookup(var.account_emails, "prod")
 role_name = var.iam_account_role_name
 parent_id = aws_organizations_organizational_unit.environments.id

 depends_on = [aws_organizations_organization.org]
}
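
Once the accounts exist, it can be handy to see their IDs without opening the console. The following is a minimal sketch, assuming you add an output to aws_environment_accounts.tf; the output name environment_account_ids is hypothetical and not part of the tutorial's repository:

```hcl
# Hypothetical output, not part of the tutorial's files: exposes the IDs of
# the three accounts defined above so they are printed after "terraform apply".
output "environment_account_ids" {
  description = "AWS account IDs of the dev, test, and prod accounts"
  value = {
    dev  = aws_organizations_account.dev.id
    test = aws_organizations_account.test.id
    prod = aws_organizations_account.prod.id
  }
}
```

After an apply, running terraform output environment_account_ids should print the map of account IDs.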
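
Tip: if you are applying this approach to existing AWS accounts and use terraform import, take note of the guidance on the resource's documentation page about avoiding account re-creation on import. In addition, if your account already has an AWS Organizations organization set up, you need to import it by running terraform import aws_organizations_organization.org 111122223333. Note: replace the number in the example with your AWS account ID.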

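Let's review these changes and then apply them from the terminal of the CodeCatalyst Dev Environment. Run terraform init first, followed by terraform plan. (Note: if you are continuing from the previous tutorial, running these steps again will not cause any issues.)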

terraform init
terraform plan

The proposed changes should look similar to this:

Terraform used the selected providers to generate the following execution plan. Resource actions are indicated with the following symbols:
 + create

Terraform will perform the following actions:

 # aws_organizations_account.dev will be created
 + resource "aws_organizations_account" "dev" {
 + arn = (known after apply)
 + close_on_deletion = false
 + create_govcloud = false
 + email = "tf-demo+dev@example.com"
 + govcloud_id = (known after apply)
 + id = (known after apply)
 + joined_method = (known after apply)
 + joined_timestamp = (known after apply)
 + name = "dev"
 + parent_id = (known after apply)
 + role_name = "Org-Admin"
 + status = (known after apply)
 + tags_all = (known after apply)
 }

 # aws_organizations_account.prod will be created
 + resource "aws_organizations_account" "prod" {
 + arn = (known after apply)
 + close_on_deletion = false
 + create_govcloud = false
 + email = "tf-demo+prod@example.com"
 + govcloud_id = (known after apply)
 + id = (known after apply)
 + joined_method = (known after apply)
 + joined_timestamp = (known after apply)
 + name = "prod"
 + parent_id = (known after apply)
 + role_name = "Org-Admin"
 + status = (known after apply)
 + tags_all = (known after apply)
 }

 # aws_organizations_account.test will be created
 + resource "aws_organizations_account" "test" {
 + arn = (known after apply)
 + close_on_deletion = false
 + create_govcloud = false
 + email = "tf-demo+test@example.com"
 + govcloud_id = (known after apply)
 + id = (known after apply)
 + joined_method = (known after apply)
 + joined_timestamp = (known after apply)
 + name = "test"
 + parent_id = (known after apply)
 + role_name = "Org-Admin"
 + status = (known after apply)
 + tags_all = (known after apply)
 }

 # aws_organizations_organization.org will be created
 + resource "aws_organizations_organization" "org" {
 + accounts = (known after apply)
 + arn = (known after apply)
 + aws_service_access_principals = [
 + "cloudtrail.amazonaws.com",
 + "config.amazonaws.com",
 + "sso.amazonaws.com",
 ]
 + feature_set = "ALL"
 + id = (known after apply)
 + master_account_arn = (known after apply)
 + master_account_email = (known after apply)
 + master_account_id = (known after apply)
 + non_master_accounts = (known after apply)
 + roots = (known after apply)
 }

 # aws_organizations_organizational_unit.environments will be created
 + resource "aws_organizations_organizational_unit" "environments" {
 + accounts = (known after apply)
 + arn = (known after apply)
 + id = (known after apply)
 + name = "environments"
 + parent_id = (known after apply)
 + tags_all = (known after apply)
 }

Plan: 5 to add, 0 to change, 0 to destroy.

Apply the changes with terraform apply:

terraform apply

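During the apply, or shortly after it completes, you will receive welcome emails for the new accounts. If this is the first time you are using AWS Organizations in your account, also look out for an email with the subject AWS Organizations email verification request, and click Verify email address to complete the verification.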

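You may wonder why we run these commands directly instead of using the workflows we set up earlier. Setting up the environment accounts requires a number of changes to the main-infra repository, and they have to be applied to the accounts before we can continue with the rest of the setup. This could be done through pull requests, but it is a one-off operation, and going through PRs would only make it take longer.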

Setting up Terraform permissions in the environment accounts

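Next, we create the IAM roles and policies for the new accounts, and add policies to the existing IAM roles so that they can assume the new roles in the environment accounts. First, we need additional AWS providers so that Terraform can create infrastructure in the new accounts. Terraform provider aliases let you configure several instances of the same provider with different settings. Here we keep the default provider for the main account and add one aliased provider each for the dev, test, and prod accounts, which access those accounts by assuming the IAM role created together with each account. Add the following to the providers.tf file in the root of main-infra. Because the new AWS accounts were created with Terraform, we can reference their account IDs directly from those resources: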

provider "aws" {
 alias = "dev"
 region = var.aws_region
 assume_role {
 role_arn = "arn:aws:iam::${aws_organizations_account.dev.id}:role/${var.iam_account_role_name}"
 session_name = "dev-account-from-main"
 }
}

provider "aws" {
 alias = "test"
 region = var.aws_region
 assume_role {
 role_arn = "arn:aws:iam::${aws_organizations_account.test.id}:role/${var.iam_account_role_name}"
 session_name = "test-account-from-main"
 }
}

provider "aws" {
 alias = "prod"
 region = var.aws_region
 assume_role {
 role_arn = "arn:aws:iam::${aws_organizations_account.prod.id}:role/${var.iam_account_role_name}"
 session_name = "prod-account-from-main"
 }
}
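
To check that role switching works before creating anything in the new accounts, one option is to read the caller identity through an aliased provider. This is a sketch under the assumption that you add it temporarily to main-infra; the data source name dev_check and the output name dev_caller_arn are hypothetical:

```hcl
# Hypothetical check, not part of the tutorial's files: resolves the identity
# Terraform uses in the dev account through the aliased provider defined above.
data "aws_caller_identity" "dev_check" {
  provider = aws.dev
}

# If the assume_role block works as intended, this should show an
# assumed-role ARN in the dev account rather than your main account identity.
output "dev_caller_arn" {
  value = data.aws_caller_identity.dev_check.arn
}
```

The same pattern applies to aws.test and aws.prod; remove the check once you have confirmed access.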

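We can now reach the environment accounts through these new providers. Next, we need to create the IAM roles that the workflows use for the main branch and PR branches in each account, plus a read-only role for all users. This ensures that the ReadOnly role used for PR branches in the main account can only assume the matching ReadOnly role in each environment account, and the same applies to the AdministratorAccess role. Each environment account also gets a ReadOnly role for team members; anyone assuming it should not be able to make changes through the AWS Management Console, AWS CLI, or API. Run the following commands to add the roles for each account: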

wget https://raw.githubusercontent.com/build-on-aws/manage-multiple-environments-with-terraform/main/env_accounts_main_branch_iam_role.tf
wget https://raw.githubusercontent.com/build-on-aws/manage-multiple-environments-with-terraform/main/env_accounts_pr_branch_iam_role.tf
wget https://raw.githubusercontent.com/build-on-aws/manage-multiple-environments-with-terraform/main/env_accounts_users_read_only_iam_role.tf

The downloaded files are shown below:

  • env_accounts_main_branch_iam_role.tf
  • env_accounts_pr_branch_iam_role.tf
  • # Policy allowing the PR branches in our repo to assume the role from each environment account.
    data "aws_iam_policy_document" "env_accounts_pr_branch_assume_role_policy" {
      statement {
        actions = ["sts:AssumeRole"]
        effect  = "Allow"
        principals {
          type = "AWS"
          identifiers = [
            "arn:aws:iam::${data.aws_organizations_organization.org.master_account_id}:root"
          ]
        }
      }
    }
    
    # Dev Account IAM Roles
    # =====================
    
    # Role to allow the PR branch to use this dev AWS account
    resource "aws_iam_role" "dev_pr_branch" {
      provider           = aws.dev
      name               = "PR-Branch-Infrastructure"
      assume_role_policy = data.aws_iam_policy_document.env_accounts_pr_branch_assume_role_policy.json
    }
    
    # Allow role read-only rights in the account to run "terraform plan"
    resource "aws_iam_role_policy_attachment" "dev_readonly_policy_pr_branch" {
      provider   = aws.dev
      role       = aws_iam_role.dev_pr_branch.name
      policy_arn = "arn:aws:iam::aws:policy/ReadOnlyAccess"
    }
    
    
    # Test Account IAM Roles
    # ======================
    
    # Role to allow the PR branch to use this test AWS account
    resource "aws_iam_role" "test_pr_branch" {
      provider           = aws.test
      name               = "PR-Branch-Infrastructure"
      assume_role_policy = data.aws_iam_policy_document.env_accounts_pr_branch_assume_role_policy.json
    }
    
    # Allow role read-only rights in the account to run "terraform plan"
    resource "aws_iam_role_policy_attachment" "test_readonly_policy_pr_branch" {
      provider   = aws.test
      role       = aws_iam_role.test_pr_branch.name
      policy_arn = "arn:aws:iam::aws:policy/ReadOnlyAccess"
    }
    
    
    # Prod Account IAM Roles
    # ======================
    
    # Role to allow the PR branch to use this prod AWS account
    resource "aws_iam_role" "prod_pr_branch" {
      provider           = aws.prod
      name               = "PR-Branch-Infrastructure"
      assume_role_policy = data.aws_iam_policy_document.env_accounts_pr_branch_assume_role_policy.json
    }
    
    # Allow role read-only rights in the account to run "terraform plan"
    resource "aws_iam_role_policy_attachment" "prod_readonly_policy_pr_branch" {
      provider   = aws.prod
      role       = aws_iam_role.prod_pr_branch.name
      policy_arn = "arn:aws:iam::aws:policy/ReadOnlyAccess"
    }
    
    # Adding permission to the PR-Branch role in the Main account to assume the 
    # PR-Branch role in each environment account.
    data "aws_iam_policy_document" "pr_branch_assume_role_in_environment_account_policy" {
      statement {
        actions = ["sts:AssumeRole"]
        effect  = "Allow"
        resources = [
          aws_iam_role.dev_pr_branch.arn,
          aws_iam_role.test_pr_branch.arn,
          aws_iam_role.prod_pr_branch.arn
        ]
      }
    }
    
    # Since the IAM role was created as part of the bootstrapping, we need to 
    # reference it using a data source to add the additional policy to allow
    # role switching.
    data "aws_iam_role" "pr_branch" {
      name = "PR-Branch-Infrastructure"
    }
    
    resource "aws_iam_policy" "pr_branch_role_assume_environment_accounts_roles" {
      name        = "PR-Branch-Assume-Environment-Account-Role"
      path        = "/"
      description = "Policy allowing the PR branch role to assume the equivalent role in the environment accounts."
      policy      = data.aws_iam_policy_document.pr_branch_assume_role_in_environment_account_policy.json
    }
    
    resource "aws_iam_role_policy_attachment" "pr_branch_role_assume_environment_accounts_roles" {
      role       = data.aws_iam_role.pr_branch.name
      policy_arn = aws_iam_policy.pr_branch_role_assume_environment_accounts_roles.arn
    }
    
  • env_accounts_users_read_only_iam_role.tf

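Since all changes are made through Terraform, we create a default read-only role for users (developers, DevOps/SRE teams, and so on) that gives them view access to the console. Setting up the users themselves is the next step. The IAM roles above look very similar to the ones we created earlier; the difference is the trust policy, which now allows the roles to be assumed from the main account rather than by the CodeCatalyst task runner. The task runner assumes a role in the main account first, and only then can assume the corresponding role in each environment account.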

  statement {
    actions = ["sts:AssumeRole"]
    effect  = "Allow"
    principals {
      type = "AWS"
      identifiers = [
        "arn:aws:iam::${data.aws_organizations_organization.org.master_account_id}:root"
      ]
    }
  }

We also don't need to hard-code the main account's ID, because we can look it up with the aws_organizations_organization data source.

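You may have noticed that the read-only roles have no access to the S3 state file bucket, the DynamoDB lock table, or the KMS key used to encrypt the state file. Two roles are in play in a workflow: the base IAM role the workflow runs with, and the role specified in the AWS provider. We don't specify a role in the Terraform backend configuration, so Terraform uses the role provided by the workflow for state access, while the AWS provider for each account that resources are created in uses the role given in its provider configuration. As a result, the state file can be updated without granting any additional permissions inside the environment accounts, because the state is stored in the main account, not in the environment accounts. We will take the same approach for the new environments-infra repository.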

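We can now run terraform apply to add the new IAM roles; the plan should show 12 resources to add. After that, we need to set up the IAM roles for developers in the different environment accounts. Before moving on, commit the changes so that they are not accidentally lost: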

terraform apply
# accept changes with "yes"

git add .
git commit -m "Adding providers and IAM roles for workflows for environment accounts"
git push

Setting up workflows for the environment accounts

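Everything is now in place to start adding infrastructure to the environment accounts, and we need a new repository to store it. It could be done from the main-infra repository, but that repository should be treated as sensitive, since it manages all the infrastructure for every account, and access to it should be restricted. We will use a new repository called environments-infra for the day-to-day infrastructure. In CodeCatalyst, go to Code > Source repositories and click Add repository to create it. Name it environments-infra, use "Manages the dev, test, and prod accounts' infrastructure." as the description, and set the .gitignore file to Terraform. Once it is created, click Clone repository and then Copy to copy the URL for git clone: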

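The Dev Environment is configured to use the AWS CodeCommit credential helper. You can either leave this as is and generate a personal access token from the Clone repository dialog, or edit ~/.gitconfig and, under [credential], change the credential helper from helper = !aws codecommit credential-helper $@ to helper = /aws/mde/credential-helper. With the latter, you don't need to supply a personal access token every time you clone a CodeCatalyst repository. To clone the new repository, change to the /projects directory in your terminal, type git clone, and paste the URL (it will differ based on your user alias, space, and project name):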

# Replace the placeholder, including the "<>" brackets, with your clone URL.
git clone <Your clone URL>

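If you did not change the credential helper, you will be prompted for a password. In CodeCatalyst, open the Clone repository dialog, click Create token, then click Copy next to it and paste the token into the terminal as the password.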

Cloning into 'environments-infra'...
remote: Counting objects: 4, done.
Unpacking objects: 100% (4/4), 1.11 KiB | 567.00 KiB/s, done.

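Configuring Terraform

First, we define the backend for the environment accounts. Create terraform.tf in environments-infra with the following content. It is almost identical to the one in the main-infra repository; the only difference is the key, which now includes -environments: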

terraform {
  backend "s3" {
    bucket         = "tf-state-files"
    key            = "terraform-environments-state-file/terraform.tfstate"
    region         = "us-east-1"
    dynamodb_table = "TerraformMainStateLock"
    kms_key_id     = "alias/s3" # Optionally change this to the custom KMS alias you created - "alias/terraform"
  }

  required_providers {
    aws = {
      source  = "hashicorp/aws"
      version = "~> 4.33"
    }
  }

  required_version = "= 1.3.7"
}

Next, verify that the backend configuration is correct. Change into the environments-infra directory and run terraform init to initialize Terraform:

$ terraform init

Initializing the backend...

Successfully configured the backend "s3"! Terraform will automatically
use this backend unless the backend configuration changes.

Initializing provider plugins...
- Finding hashicorp/aws versions matching "~> 4.33"...
- Installing hashicorp/aws v4.57.0...
- Installed hashicorp/aws v4.57.0 (signed by HashiCorp)

Terraform has created a lock file .terraform.lock.hcl to record the provider
selections it made above. Include this file in your version control repository
so that Terraform can guarantee to make the same selections by default when
you run "terraform init" in the future.

Terraform has been successfully initialized!

You may now begin working with Terraform. Try running "terraform plan" to see
any changes that are required for your infrastructure. All Terraform commands
should now work.

If you ever set or change modules or backend configuration for Terraform,
rerun this command to reinitialize your working directory. If you forget, other
commands will detect it and remind you to do so if necessary.

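We use variables.tf to manage variables; it needs to live in the environments-infra directory. Start by adding the aws_region and tf_caller variables. We will use the aws_caller_identity data source to determine who is calling terraform plan, and choose the role to use based on whether the caller is a user or a workflow. Create variables.tf with the following content: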

variable "aws_region" {
  default = "us-east-1"
}

variable "tf_caller" {
  default = "User"
}

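The role defaults to User, and in the workflows we will add an environment variable that indicates which role to use: the AdministratorAccess role for the main branch (which needs permission to create, update, and delete infrastructure), or the ReadOnly role for pull requests.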
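
Because tf_caller ends up inside a role ARN, a typo would only surface as a failed role assumption. One way to catch it earlier is a validation block. The following is a sketch, not the tutorial's code: it would replace the tf_caller declaration above, and it assumes the only valid non-user values are the two workflow role names created earlier in this tutorial.

```hcl
# Sketch of a stricter tf_caller declaration (a replacement for the one above,
# not an addition). The validation block and description are assumptions.
variable "tf_caller" {
  type        = string
  default     = "User"
  description = "Either \"User\" or the name of the workflow IAM role to assume"

  validation {
    condition = startswith(var.tf_caller, "User") || contains(
      ["Main-Branch-Infrastructure", "PR-Branch-Infrastructure"],
      var.tf_caller
    )
    error_message = "The tf_caller value must start with \"User\" or be a workflow role name."
  }
}
```

Terraform reads input variables from environment variables named TF_VAR_<name>, so a workflow can supply this value by setting TF_VAR_tf_caller.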

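Next, create providers.tf to configure the provider for the environment accounts. This needs the AWS account ID. In main-infra we could reference the aws_organizations_account resources directly when creating a provider per environment account, but those resources are not managed in this repository, so we can no longer reference them and need a different approach. We will look up the AWS Organizations accounts with a Terraform data source through a second AWS provider, and store the result in a local value. This second provider is defined with an alias and uses the credentials of whoever runs the command: the workflow's IAM role when run in a workflow, or the IAM user when run by a user. Instead of configuring a provider per account, we configure the default AWS provider and switch accounts based on the selected Terraform workspace (workspaces are covered in the next section). For this to work, the workspace names must match the environment names.

Add the following to providers.tf: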

# Configuring the main account AWS provider
provider "aws" {
  alias  = "main"
  region = var.aws_region
}

# Look up environment account using main account provider
data "aws_organizations_organization" "org" {
  provider = aws.main
}

# Retrieve the caller identity to append it to the assume role session name.
data "aws_caller_identity" "current" {
  provider = aws.main
}

locals {
  # Set environment account ID as local value based on selected Terraform
  # workspace.
  account_id = data.aws_organizations_organization.org.accounts[
    index(data.aws_organizations_organization.org.accounts.*.name, terraform.workspace)
  ].id

  # Sets the role to assume based on the calling identity, if it is a user,
  # default to read-only, otherwise use the role passed by the environment
  # variable (this should only be the CI runner).
  iam_role_name = startswith(var.tf_caller, "User") ? "Users-Read-Only" : var.tf_caller
}

provider "aws" {
  region = var.aws_region

  assume_role {
    role_arn     = "arn:aws:iam::${local.account_id}:role/${local.iam_role_name}"
    session_name = "dev-account-from-main-${local.iam_role_name}"
  }
}
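
The account_id lookup above finds the position of the workspace name in the list of account names and then indexes into the list. An equivalent way to express it, shown here only as an alternative sketch, is to build a name-to-ID map with a for expression. The local name account_ids_by_name is hypothetical, and the snippet relies on the data.aws_organizations_organization.org data source declared in providers.tf:

```hcl
# Alternative sketch for the account_id local in providers.tf (a replacement,
# not an addition). It assumes account names are unique in the organization.
locals {
  # Map of account name => account ID for every account in the organization.
  account_ids_by_name = {
    for account in data.aws_organizations_organization.org.accounts :
    account.name => account.id
  }

  # Terraform reports an error at plan time if no account is named after the
  # selected workspace, for example when the "default" workspace is selected.
  account_id = local.account_ids_by_name[terraform.workspace]
}
```

Both forms depend on the workspace name matching an account name exactly, which is why the workspaces are named dev, test, and prod.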

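Before we can test this, we need to look at Terraform workspaces and how to use them.

Workspace configuration

Terraform workspaces let you store the state file for each workspace in a separate location, so the same infrastructure definition can be used to create multiple sets of infrastructure. We will have one folder of Terraform files but apply it to three environments, and each environment needs its own state file so that they don't overwrite each other despite sharing the same backend configuration. Workspaces make this simple, because the location of the state file defined by the backend changes with the workspace name. Run the following commands to create the workspaces, using the same names as the environments: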

terraform workspace new dev
terraform workspace new test
terraform workspace new prod

terraform workspace select dev

You should see the following output:

$ terraform workspace new dev
Created and switched to workspace "dev"!

You're now on a new, empty workspace. Workspaces isolate their state,
so if you run "terraform plan" Terraform will not see any existing state
for this configuration.

$ terraform workspace new test
Created and switched to workspace "test"!

You're now on a new, empty workspace. Workspaces isolate their state,
so if you run "terraform plan" Terraform will not see any existing state
for this configuration.

$ terraform workspace new prod
Created and switched to workspace "prod"!

You're now on a new, empty workspace. Workspaces isolate their state,
so if you run "terraform plan" Terraform will not see any existing state
for this configuration.

$ terraform workspace select dev 
Switched to workspace "dev".
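
To make the state separation concrete: assuming the S3 backend with its default workspace_key_prefix of env:, Terraform stores the state of every non-default workspace under env:/<workspace>/<key>. The sketch below only mirrors that naming rule. The function name and the key terraform.tfstate are hypothetical, so substitute the key from your own backend configuration.

```shell
# Hypothetical helper: print the object key under which the S3 backend
# stores state for a workspace, assuming the default prefix "env:".
# $1 = workspace name, $2 = "key" value from the backend configuration
state_key_for_workspace() {
  if [ "$1" = "default" ]; then
    echo "$2"
  else
    echo "env:/$1/$2"
  fi
}

state_key_for_workspace dev terraform.tfstate
# prints env:/dev/terraform.tfstate
state_key_for_workspace default terraform.tfstate
# prints terraform.tfstate
```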

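We can now create and store the state of our infrastructure, but because a single repository holds the infrastructure definition for all three environment accounts (dev, test, prod), we need a way to give each environment its own configuration. All three accounts should have the same infrastructure, but with different quantities and sizes. To do this, we will use one variable definitions file per environment to store its values. For now, we need to create empty files and set up the rest of the CI/CD pipeline. Run the following commands to create the empty variable files: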

mkdir -p env_accounts_vars

touch env_accounts_vars/dev.tfvars
touch env_accounts_vars/test.tfvars
touch env_accounts_vars/prod.tfvars

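Define each new variable in variables.tf, then set a different value for it in each file according to the environment's needs. Values are added to the matching .tfvars file as key = value pairs, with string values in double quotes. For example, for a variable named db_instance_size, we could set db_instance_size = "db.t3.micro" in dev.tfvars, db_instance_size = "db.m5.large" in test.tfvars, and db_instance_size = "db.m5.8xlarge" in prod.tfvars. The .tfvars file is then passed to the plan or apply command: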

# Example, do not run
terraform plan -var-file="env_accounts_vars/dev.tfvars"

There is one small problem: nothing ensures that the correct Terraform workspace is selected. To ensure that, we would need to run the following commands:

# Example, do not run
terraform workspace select dev
terraform plan -var-file="env_accounts_vars/dev.tfvars"

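This handles multiple environments, but it requires a lot of typing or copy-pasting, and it is very easy to apply the wrong variables to an environment, for example: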

# Example, do not run
terraform workspace select dev
terraform plan -var-file="env_accounts_vars/prod.tfvars"

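This would apply the prod values to the dev environment. Fortunately, there is a way to make this both simpler and safer.

A Makefile for the commands

We will create a Makefile that wraps these commands and defines validate, plan, apply, and destroy targets. Use the following command to create the Makefile (no file extension) in the root of the project: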

wget https://raw.githubusercontent.com/build-on-aws/manage-multiple-environments-with-terraform-environment-infra/main/Makefile

The Makefile should contain the following (note that indentation must use tabs, not spaces):

.PHONY: default validate plan apply destroy

default:
 @echo ""
 @echo "Runs Terraform validate, plan, and apply wrapping the workspace to use"
 @echo ""
 @echo "The following commands are available:"
 @echo " - plan : runs terraform plan for an environment"
 @echo " - apply : runs terraform apply for an environment"
 @echo " - destroy : will delete the entire project's infrastructure"
 @echo ""
 @echo 'The "ENV" environment variable needs to be set to dev, test, or prod.'
 @echo ""
 @echo "Example usage:"
 @echo " ENV=dev make plan"
 @echo ""

validate:
 $(call check_defined, ENV, Please set the ENV to plan for. Values should be dev, test, or prod)

 @echo "Initializing Terraform ..."
 @terraform init
 
 @echo 'Creating the $(value ENV) workspace ...'
 @-terraform workspace new $(value ENV)

 @echo 'Switching to the [$(value ENV)] environment ...'
 @terraform workspace select $(value ENV)

 @terraform validate -no-color

plan:
 $(call check_defined, ENV, Please set the ENV to plan for. Values should be dev, test, or prod)

 @echo "Initializing Terraform ..."
 @terraform init
 
 @echo 'Creating the $(value ENV) workspace ...'
 @-terraform workspace new $(value ENV)

 @echo 'Switching to the [$(value ENV)] workspace ...'
 @terraform workspace select $(value ENV)

 @terraform plan \
 -var-file="env_accounts_vars/$(value ENV).tfvars" -no-color -input=false \
 -out $(value ENV).plan


apply:
 $(call check_defined, ENV, Please set the ENV to apply. Values should be dev, test, or prod)

 @echo 'Creating the $(value ENV) workspace ...'
 @-terraform workspace new $(value ENV)

 @echo 'Switching to the [$(value ENV)] workspace ...'
 @terraform workspace select $(value ENV)

 @terraform apply -auto-approve -no-color -input=false \
 -var-file="env_accounts_vars/$(value ENV).tfvars"


destroy:
 $(call check_defined, ENV, Please set the ENV to destroy. Values should be dev, test, or prod)

 @echo "Initializing Terraform ..."
 @terraform init
 
 @echo 'Creating the $(value ENV) workspace ...'
 @-terraform workspace new $(value ENV)

 @echo 'Switching to the [$(value ENV)] workspace ...'
 @terraform workspace select $(value ENV)

 @echo "## ##"
 @echo "Are you really sure you want to completely destroy [$(value ENV)] environment ?"
 @echo "## ##"
 @read -p "Press enter to continue"
 @terraform destroy \
 -var-file="env_accounts_vars/$(value ENV).tfvars"


# Check that given variables are set and all have non-empty values,
# die with an error otherwise.
#
# Params:
# 1. Variable name(s) to test.
# 2. (optional) Error message to print.
check_defined = \
 $(strip $(foreach 1,$1, \
 $(call __check_defined,$1,$(strip $(value 2)))))
__check_defined = \
 $(if $(value $1),, \
 $(error Undefined $1$(if $2, ($2))))

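Commands and syntax used in the Makefile:

  • .PHONY: make is designed for building files, so it treats each target name as a file name. If a file named plan, apply, or destroy exists, make considers that target up to date and does not run it. Declaring the targets in .PHONY avoids this. For details, see the GNU make documentation.
  • @: prefixing a command with @ stops make from printing the command before running it.
  • -: prefixing a command with - lets make continue even if that command fails, as explained below.
  • $(call check_defined, ENV ...: calls the function defined at the bottom of the file to verify that the ENV environment variable is set, and exits with an error if it is not.

Looking at the targets (validate, plan, apply, and destroy), you can see that workspace new and workspace select are called every time, and init is called in every target except apply. You might ask why, if these only need to happen once. The reason is to avoid depending on steps being run in a particular order, whether in a workflow or by a new team member. It takes slightly longer, but it prevents problems when someone forgets a required step.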
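
Note that check_defined only verifies that ENV is non-empty; it does not verify that the value is one of the three environments. A stricter pre-flight check could be sketched as follows. This is not part of the tutorial's Makefile, and check_env is a hypothetical name.

```shell
# Hypothetical pre-flight check: succeed only if the argument is dev, test,
# or prod AND the matching variable file exists in env_accounts_vars.
check_env() {
  case "$1" in
    dev|test|prod) ;;
    *)
      echo "ENV must be dev, test, or prod (got '$1')" >&2
      return 1
      ;;
  esac
  if [ ! -f "env_accounts_vars/$1.tfvars" ]; then
    echo "Missing env_accounts_vars/$1.tfvars" >&2
    return 1
  fi
}

# Example (not executed here): check_env "$ENV" && make plan
```

Run from the repository root, this stops before Terraform is invoked if ENV contains a typo or the variable file has not been created.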

We can now run terraform plan against the correct workspace and .tfvars file with the following command:

ENV=dev make plan

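We recommend testing the Makefile by running ENV=dev make plan. If you get an error, check whether the indentation uses spaces; it must use tabs. Once it works, we are ready to set up workflows for the main branch and for PR branches that apply code changes to the infrastructure. The output should include the following: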

$ ENV=dev make plan

Initializing Terraform ...
Initializing modules...

Initializing the backend...

Initializing provider plugins...
- Reusing previous version of hashicorp/aws from the dependency lock file
- Using previously-installed hashicorp/aws v4.57.0

Terraform has been successfully initialized!

You may now begin working with Terraform. Try running "terraform plan" to see
any changes that are required for your infrastructure. All Terraform commands
should now work.

If you ever set or change modules or backend configuration for Terraform,
rerun this command to reinitialize your working directory. If you forget, other
commands will detect it and remind you to do so if necessary.
Creating the dev workspace ...
Workspace "dev" already exists
make: [plan] Error 1 (ignored)
Switching to the [dev] workspace ...

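You will notice an error in the output. This is expected: the Makefile always runs terraform workspace new, which fails when the workspace already exists, and the - prefix tells make to ignore that failure.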

# Example of error output

Creating the dev workspace ...
Workspace "dev" already exists
make: [plan] Error 1 (ignored)
Switching to the [dev] workspace ...
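
If make instead stops with a "missing separator" error, space-indented recipe lines are the usual cause. A minimal sketch for locating them follows; find_space_indented is a hypothetical helper name, and the file is assumed to be named Makefile.

```shell
# Hypothetical helper: list the lines of a file that begin with a space
# instead of a tab, prefixed with their line numbers.
# Returns non-zero when no such line exists.
find_space_indented() {
  grep -n '^ ' "$1"
}

# Example (not executed here):
#   find_space_indented Makefile || echo "no space-indented lines"
```

Continuation lines inside multi-line variable definitions, such as check_defined at the bottom of the file, may also be listed. Those are harmless; only recipe lines under a target need a leading tab.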

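Workflows

Create the workflows in the same way as in the main-infra repository. One workflow is triggered by changes to the main branch and applies them to the infrastructure. A second workflow runs plan when a pull request is opened or revised, to make sure there are no errors. Run the following commands to create the workflows: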

mkdir -p .codecatalyst/workflows

wget -P .codecatalyst/workflows https://raw.githubusercontent.com/build-on-aws/manage-multiple-environments-with-terraform-environment-infra/main/.codecatalyst/workflows/main_branch.yml
wget -P .codecatalyst/workflows https://raw.githubusercontent.com/build-on-aws/manage-multiple-environments-with-terraform-environment-infra/main/.codecatalyst/workflows/pr_branch.yml

The contents of each file:

  • main_branch.yml
    Name: Environment-Account-Main-Branch
    SchemaVersion: "1.0"
    
    Triggers:
      - Type: Push
        Branches:
          - main
    
    Actions:
      Terraform-Main-Branch-Apply:
        Identifier: aws/build@v1
        Inputs:
          Sources:
            - WorkflowSource
          Variables:
            - Name: TF_VAR_tf_caller
              Value: Main-Branch-Infrastructure
        Environment:
          Connections:
            - Role: Main-Branch-Infrastructure
              Name: "111122223333" # Replace with your AWS Account ID here.
          Name: MainAccount
        Configuration:
          Steps:
            - Run: export TF_VERSION=1.3.7 && wget -O terraform.zip "https://releases.hashicorp.com/terraform/${TF_VERSION}/terraform_${TF_VERSION}_linux_amd64.zip"
            - Run: unzip terraform.zip && rm terraform.zip && mv terraform /usr/bin/terraform && chmod +x /usr/bin/terraform
            # We will run plan for each environment before we run apply as mistakes
            # can still happen, and we don't want plan for test or prod to fail
            # after we applied changes to dev.
            - Run: ENV=dev make plan
            - Run: ENV=test make plan
            - Run: ENV=prod make plan
            - Run: ENV=dev make apply
            - Run: ENV=test make apply
            - Run: ENV=prod make apply
        Compute:
          Type: EC2
  • pr_branch.yml
    Name: Environment-Account-PR-Branch
    SchemaVersion: "1.0"
    
    Triggers:
      - Type: PULLREQUEST
        Events:
          - OPEN
          - REVISION
    
    Actions:
      Terraform-PR-Branch-Plan:
        Identifier: aws/build@v1
        Inputs:
          Sources:
            - WorkflowSource
          Variables:
            - Name: TF_VAR_tf_caller
              Value: PR-Branch-Infrastructure
        Environment:
          Connections:
            - Role: PR-Branch-Infrastructure
              Name: "111122223333" # Replace with your AWS Account ID here.
          Name: MainAccount
        Configuration: 
          Steps:
            - Run: export TF_VERSION=1.3.7 && wget -O terraform.zip "https://releases.hashicorp.com/terraform/${TF_VERSION}/terraform_${TF_VERSION}_linux_amd64.zip"
            - Run: unzip terraform.zip && rm terraform.zip && mv terraform /usr/bin/terraform && chmod +x /usr/bin/terraform
            - Run: ENV=dev make plan
            - Run: ENV=test make plan
            - Run: ENV=prod make plan
        Compute:
          Type: EC2
    

Be sure to change 111122223333 in both files to your own AWS account ID.
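
The ordering used in main_branch.yml (plan every environment first, apply only if all plans succeed) can be sketched as a shell function. This is an illustration rather than part of the tutorial's repository: plan_then_apply is a hypothetical name, and running apply outside the workflow assumes the caller's role is allowed to make changes, which the read-only user role is not.

```shell
# Hypothetical helper: plan all given environments, then apply them in the
# same order. Stops at the first failing step, so no environment is applied
# unless every plan succeeded.
plan_then_apply() {
  for env in "$@"; do
    ENV="$env" make plan || return 1
  done
  for env in "$@"; do
    ENV="$env" make apply || return 1
  done
}

# Example (not executed here): plan_then_apply dev test prod
```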

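Before committing and pushing all the changes to start the workflows, make sure the *.tfvars files are included. As the comment in the .gitignore file explains, committing them is generally not recommended, because they are typically used for developer-specific or local overrides. In our case we want them under version control, so we add the -f flag to force git to add these files. Any changes to them will then be tracked by git, while newly created .tfvars files are still ignored, so users can still use such files locally. Run the following command in the environments-infra directory to add the .tfvars files: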

git add -f env_accounts_vars/*.tfvars

All the basic setup is now complete. Next, commit all the changes by running the following commands:

git add .
git commit -m "Base workflows and bootstrapping added"
git push

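You can now navigate to CI/CD > Workflows and select All repositories in the first dropdown. You should see four build jobs: one for the main branch and one for the PR branch, for each of Main-Account and Environment-Accounts.

Using the workflows

Now that the environment accounts are set up, let's create a VPC in each of them. First, create a branch to work on, so that we can test with our PR workflow. In the environments-infra repository, run the following command: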

git checkout -b add-vpc

We will use the terraform-aws-modules/vpc/aws module. Create a new file named vpc.tf and add the following code to it:

module "vpc" {
  source = "terraform-aws-modules/vpc/aws"

  name = "CodeCatalyst-Terraform"
  cidr = var.vpc_cidr

  azs            = ["${var.aws_region}a", "${var.aws_region}b", "${var.aws_region}c"]
  public_subnets = var.public_subnet_ranges

  enable_nat_gateway = false
  enable_vpn_gateway = false
}

Add the two new variables, vpc_cidr and public_subnet_ranges, to variables.tf:

variable "vpc_cidr" {
  type = string
}

variable "public_subnet_ranges" {
  type = list(string)
}

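Finally, set values for these two variables in each environment's .tfvars file. We use a different IP range for each account to tell them apart. This is also good practice if you plan to set up VPC peering between the accounts, because peered VPCs cannot have overlapping IP ranges. Add the following to the corresponding .tfvars files: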

  • dev.tfvars
    vpc_cidr = "10.0.0.0/16"
    public_subnet_ranges = ["10.0.10.0/24", "10.0.11.0/24", "10.0.12.0/24"]
  • test.tfvars
    vpc_cidr = "10.30.0.0/16"
    public_subnet_ranges = ["10.30.10.0/24", "10.30.11.0/24", "10.30.12.0/24"]
  • prod.tfvars
    vpc_cidr = "10.50.0.0/16"
    public_subnet_ranges = ["10.50.10.0/24", "10.50.11.0/24", "10.50.12.0/24"]
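
Since the files are edited by hand, a quick sanity check can catch a copy-paste mistake where two environments end up with the same vpc_cidr. The sketch below is a rough check under stated assumptions: duplicate_vpc_cidrs is a hypothetical helper, it expects one vpc_cidr assignment per file with the value in double quotes, and it only detects identical values. It does not prove that different ranges do not overlap.

```shell
# Hypothetical helper: print every vpc_cidr value that appears in more than
# one .tfvars file of the given directory. Prints nothing if all differ.
# $1 = directory containing the .tfvars files
duplicate_vpc_cidrs() {
  grep -h '^[[:space:]]*vpc_cidr' "$1"/*.tfvars \
    | sed 's/[^"]*"\([^"]*\)".*/\1/' \
    | sort \
    | uniq -d
}

# Example (not executed here): duplicate_vpc_cidrs env_accounts_vars
```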

Commit the VPC changes to the branch, then create a new pull request:

git add .
git commit -m "Add new VPC"
git push --set-upstream origin add-vpc

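Navigate to Code > Source repositories, click environments-infra, open the Actions dropdown, and choose Create pull request. Set the Source branch to add-vpc and the Destination branch to main, then add a descriptive Pull request title and Pull request description. Once it is created, navigate to CI/CD > Workflows and select All repositories in the first dropdown. Under Recent runs, you should see the Environment-Account-PR-Branch workflow running for the add-vpc branch. Click it to follow its progress. After all steps complete successfully, expand the ENV=dev make plan step to view the plan output, which lists the resources planned for the VPC. You can review the planned infrastructure for the test and prod environments in the same way.

After reviewing the changes, navigate to Code > Pull requests and click the PR. Click Merge and accept the merge. Then navigate to CI/CD > Workflows, select All repositories in the first dropdown, and under Recent runs select the currently running Environment-Account-Main-Branch workflow on the main branch. When the workflow finishes, a VPC has been created in each account.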

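Clean up

This tutorial is now complete. You can keep the current setup and build on it, or delete all the resources you just created. Follow these steps to clean up your environment:

1. In environments-infra, run git checkout main and git pull to make sure you have the latest version, then do the following:

  1. Edit providers.tf and change the role_arn value in the default aws provider from "arn:aws:iam::${local.account_id}:role/${local.iam_role_name}" to "arn:aws:iam::${local.account_id}:role/Org-Admin". Because the developer role is restricted to read-only access, an administrator role is needed for this step.
  2. Run ENV=prod make destroy and confirm to delete all resources in the prod environment.
  3. Run ENV=test make destroy and confirm to delete all resources in the test environment.
  4. Run ENV=dev make destroy and confirm to delete all resources in the dev environment.

2. In main-infra, run git checkout main and git pull to make sure you have the latest version, then do the following:

  1. Run terraform destroy and confirm.
  2. Edit _bootstrap/state_file_resources.tf and replace the aws_s3_bucket resource with the following:

resource "aws_s3_bucket" "state_file" {
  bucket = var.state_file_bucket_name

  force_destroy = true

  lifecycle {
    prevent_destroy = false
  }
}

  3. Run cd _bootstrap && terraform destroy. This command will report an error: it removes the S3 bucket and DynamoDB table while it is running, so those resources no longer exist when it tries to save the state.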

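Conclusion

Congratulations! You have now set up a multi-environment CI/CD pipeline on AWS using Terraform with CodeCatalyst, and used a pull request workflow to deploy infrastructure changes.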