AWS Partner Network (APN) Blog

Terraform: Beyond the Basics with AWS

Editor’s note: This post was updated in March 2018.

By Josh Campbell and Brandon Chavis, Partner Solutions Architects at AWS

Terraform by HashiCorp, an AWS Partner Network (APN) Advanced Technology Partner and member of the AWS DevOps Competency, is an “infrastructure as code” tool similar to AWS CloudFormation that allows you to create, update, and version your Amazon Web Services (AWS) infrastructure.

Terraform has a great set of features that make it worth adding to your tool belt, including:

  • A friendly custom syntax, with support for JSON as well.
  • Visibility into changes before they actually happen.
  • Built-in graphing feature to visualize the infrastructure.
  • An understanding of resource relationships. For example, failures are isolated to dependent resources, while non-dependent resources are still created, updated, or destroyed.
  • Open source project with a community of thousands of contributors who add features and updates.
  • The ability to break down the configuration into smaller chunks for better organization, re-use, and maintainability. The last part of this article goes into this feature in detail.

New to Terraform?

This article assumes you have some familiarity with Terraform already.

We recommend that you review the HashiCorp documentation for getting started to understand the basics of Terraform. Conveniently, their documentation uses AWS as the example cloud infrastructure of choice!

Keeping Secrets

You can provide Terraform with an AWS access key directly through the provider, but we recommend that you use a credential profile already configured by one of the AWS Software Development Kits (SDKs). This prevents you from having to maintain secrets in multiple locations or accidentally committing these secrets to version control. In either scenario, you’ll want to be sure to read our best practices for maintaining good security habits. Alternatively, you can run Terraform from one or more control servers that use an AWS Identity and Access Management (IAM) instance profile.

Each instance profile should include a policy that provides the appropriate level of permissions for each role and use case. For example, a development group may get a control server with an attached profile that enables them to run Terraform plans to create needed resources like Elastic Load Balancers and AWS Auto Scaling groups, but not resources outside the group’s scope like Amazon Redshift clusters or additional IAM roles. You’ll need to plan your control instances carefully based on your needs.
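A scoped policy like the one described above can itself be managed in Terraform. The following is a minimal sketch, assuming an IAM role named aws_iam_role.dev_control already exists for the development control server (the role and policy names are illustrative):

```hcl
# Hypothetical scoped policy for a development control server's role.
resource "aws_iam_role_policy" "dev_terraform" {
  name = "dev-terraform"
  role = "${aws_iam_role.dev_control.id}"

  # Allow the development group to manage load balancers and Auto Scaling
  # groups, but nothing outside that scope (no Redshift, no IAM changes).
  policy = <<EOF
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "elasticloadbalancing:*",
        "autoscaling:*"
      ],
      "Resource": "*"
    }
  ]
}
EOF
}
```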

To use an instance or credential profile with Terraform, inside your AWS provider block simply remove the access_key and secret_key declarations and any other variables that reference access and secret keys. Terraform will automatically know to use the instance or credential profile for all actions.
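A minimal provider block under this approach might look like the following sketch:

```hcl
provider "aws" {
  region = "${var.region}"

  # No access_key or secret_key here: the AWS provider resolves credentials
  # from environment variables, the shared credentials file
  # (~/.aws/credentials), or an attached IAM instance profile.
}
```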

If you plan to share your Terraform files publicly, you’ll want to use a terraform.tfvars file to store sensitive data or other data you don’t want to make public. Make sure this file is excluded from version control (for example, by using .gitignore). The file can be in the root directory and might look something like this:

region = "us-west-2"
keypair_name = "your_keypair_name"
corp_ip_range = "192.168.1.0/24"
some_secret = "your_secret"
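Each value in terraform.tfvars must correspond to a declared input variable. A minimal variables.tf matching the file above could be:

```hcl
# Declarations for the values supplied in terraform.tfvars.
variable "region" {}
variable "keypair_name" {}
variable "corp_ip_range" {}
variable "some_secret" {}
```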

Building Blocks

An advantage of using an infrastructure as code tool is that your configurations also become your documentation. Breaking down your infrastructure into components makes it easier to read and update as you grow. This, in turn, makes knowledge sharing easier and helps bring new team members up to speed.

Because Terraform allows you to segment chunks of infrastructure code into multiple files (more on this below), it’s up to you to decide on a logical structure for your plans. With this in mind, one best practice could be to break up Terraform files by microservice, application, security boundary, or AWS service component. For example, you might have one group of Terraform files that build out an Amazon Elastic Container Service (ECS) cluster for your inventory API and another group that builds out the AWS Elastic Beanstalk environment for your production front-end web application.

Additionally, Terraform supports powerful constructs called modules that allow you to re-use infrastructure code. This enables you to provide infrastructure as building blocks that other teams can leverage. For example, you might create a module for creating Amazon Elastic Compute Cloud (Amazon EC2) instances that uses only the instance types your company has standardized on. A service team can then include your module and automatically be in compliance. This approach creates enablement and promotes self-service.
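A minimal sketch of such a module, using the Terraform 0.11-era syntax shown elsewhere in this post (the module path, variable names, and default instance type are illustrative):

```hcl
# Hypothetical module, e.g. ./standard_instance/main.tf

variable "ami_id" {}

variable "instance_type" {
  # The company-standard default; consuming teams may override it.
  default = "t2.micro"
}

resource "aws_instance" "standard" {
  ami           = "${var.ami_id}"
  instance_type = "${var.instance_type}"
}
```

A service team that includes this module inherits the standardized instance type unless it deliberately overrides the variable.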

Organizing Complex Services with Modules

Modules are logical groupings of Terraform configuration files. Modules are intended to be shared and re-used across projects, but can also be used within a project to help better structure a complex service that includes many infrastructure components. Terraform only processes files ending with the extension .tf in the current working folder; subdirectories are reserved for modules.

Modules are an excellent way to add structure to your project, and they accept a variety of source options, including local paths, GitHub, Bitbucket, and the Terraform Module Registry, with support for versioning. You can then execute these modules from a single configuration file (we’ll use main.tf for this example) in the parent directory where your subdirectories (modules) are located. Let’s examine this concept a bit closer.

Modules, like other Terraform resources, understand your order of dependencies. For example, a module to create a launch configuration will automatically run before a module that creates an Auto Scaling group, if the AWS Auto Scaling group depends on the newly created launch configuration.

Terraform allows you to reference output variables from one module for use in different modules. The benefit is that you can create multiple, smaller Terraform files grouped by function or service as opposed to one large file with potentially hundreds or thousands of lines of code. To use Terraform modules effectively, it is important to understand the interrelationship between output variables and input variables.

At a high level, these are the steps you would take to make an object in one module available to another module:

  1. Define an output variable inside a resource configuration (module_A). The scope of a resource’s configuration details is local to its module until declared as an output.
  2. Declare the use of module_A’s output variable in the configuration of another module, module_B. Create a new key name in module_B and set the value equal to the output variable from module_A.
  3. Finally, create a variables.tf file for module_B. In this file, create an input variable with the same name as the key you defined in module_B in step 2. This variable is what allows dynamic configuration of resource(s) in a module. Because this variable is limited to module_B in scope, you need to repeat this process for any other module that needs to reference module_A’s output.

As an example, let’s say we’ve created a module called load_balancers that defines an Elastic Load Balancer. After declaring the resource, we add an output variable for the ELB’s name:

output "elb_name" {
value = "${aws_elb.elb.name}"
}

You can then reference this ELB name from another module using ${module.load_balancers.elb_name}. It is this reference that allows Terraform to build an internal dependency graph, which in turn controls creation and destruction order. Each module (remember that a module is just a set of configuration files in its own directory) that wants to use this variable must have its own variables.tf file with an input variable of elb_name defined.
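Concretely, a consuming module (the name monitoring here is illustrative) would declare the input variable, and main.tf would wire the output to it:

```hcl
# monitoring/variables.tf -- input variable declaration in the consumer.
variable "elb_name" {}

# main.tf in the parent directory -- pass module_A's output as the value.
module "monitoring" {
  source   = "./monitoring"
  elb_name = "${module.load_balancers.elb_name}"
}
```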

Sample Directory Layout Using Local Modules for Organization
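A plausible layout consistent with the example project discussed below (individual file names inside each module are illustrative):

```text
.
├── .gitignore
├── main.tf
├── terraform.tfvars
├── site/
│   ├── site.tf
│   └── variables.tf
├── load_balancers/
│   ├── variables.tf
│   └── webapp-elb.tf
├── launch_configurations/
│   ├── variables.tf
│   └── webapp-lc.tf
└── autoscaling_groups/
    ├── variables.tf
    └── webapp-asg.tf
```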

There are a few things to note about this layout:

  • main.tf is the file that will invoke each module. It provides no configuration itself aside from declaring the AWS provider.
  • Each subdirectory is a module. Each module contains its own variables.tf file.
  • We’re using version control to store our infrastructure configuration. Because this is a public repository, we’ve asked Git to not store our .tfstate files since they contain sensitive information. In production, you’ll want to store these files in private version control, such as AWS CodeCommit or Amazon Simple Storage Service (Amazon S3), where you can control access. Recent updates to Terraform have made this process even easier. Remote state backends allow automatic storage of state in Amazon S3, while locking and consistency checking can also be implemented with Amazon DynamoDB, giving you a great team-based workflow.
  • The site module contains security groups and a VPC. These are resources that have no other dependencies, and most other resources depend on these. You could create separate security_groups and vpcs modules as well. This module is the first to be called in main.tf.
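The remote state storage mentioned above can be sketched as a backend configuration (bucket and table names are illustrative):

```hcl
terraform {
  backend "s3" {
    bucket         = "example-terraform-state"  # hypothetical bucket name
    key            = "prod/terraform.tfstate"
    region         = "us-west-2"
    dynamodb_table = "terraform-locks"          # enables state locking
  }
}
```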

Following Along with an Example

In this section, we’ll walk you through an example project that creates an infrastructure with several components, including an Elastic Load Balancer and AWS Auto Scaling group, which will be our focus.

Main.tf

Looking at main.tf you will see that there are several modules defined. Let’s focus on the autoscaling_groups module first:

module "autoscaling_groups" {
source = "./autoscaling_groups"
public_subnet_id = "${module.site.public_subnet_id}"
webapp_lc_id = "${module.launch_configurations.webapp_lc_id}"
webapp_lc_name = "${module.launch_configurations.webapp_lc_name}"
webapp_elb_name = "${module.load_balancers.webapp_elb_name}"
}

The first thing to notice is the line source = "./autoscaling_groups". This simply tells Terraform that the source files for this module are in the autoscaling_groups subdirectory. Modules can be local folders as they are above, or they can come from other sources like an Amazon S3 bucket, the Terraform Module Registry, or a different Git repository. This example assumes you will run all Terraform commands from the parent directory where main.tf exists.

Autoscaling_groups Modules

If you then examine the autoscaling_groups directory, you’ll notice that it includes two files: variables.tf and webapp-asg.tf. Terraform will run any .tf files it finds in the module directory, so you can name these files whatever you want. Now look at line 20 of autoscaling_groups/webapp-asg.tf:

load_balancers = ["${var.webapp_elb_name}"]

Here we’re setting the load_balancers parameter to an array that contains a reference to the variable webapp_elb_name. If you look back at main.tf, you’ll notice that this name is also part of the configuration of the autoscaling_groups module. Looking in autoscaling_groups/variables.tf, you’ll see this variable declared with empty curly braces ({}). This is the magic behind using outputs from other modules as input variables.
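Based on the keys passed to the module in main.tf, autoscaling_groups/variables.tf would contain declarations like:

```hcl
# One input variable per key set in the module block in main.tf.
variable "public_subnet_id" {}
variable "webapp_lc_id" {}
variable "webapp_lc_name" {}
variable "webapp_elb_name" {}
```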

Load_balancers Modules

To bring it together, examine load_balancers/webapp-elb.tf and find this section:

output "webapp_elb_name" {
value = "${aws_elb.webapp_elb.name}"
}

Here we’re telling Terraform to output a variable named webapp_elb_name, whose value is equal to our ELB name as determined by Terraform after the ELB is created for us.

Summary

In this example:

  1. We created an output variable for the load_balancers module named webapp_elb_name in load_balancers/webapp-elb.tf.
  2. In main.tf under the autoscaling_groups module configuration, we set the webapp_elb_name key to the output variable of the same name from the load_balancers module as described above. This is how we reference output variables between modules with Terraform.
  3. Next, we defined this input variable in autoscaling_groups/variables.tf by simply declaring variable "webapp_elb_name" {}. Terraform automatically knows to set the value of webapp_elb_name to the output variable from the load_balancers module, because we declared it in the configuration of our autoscaling_groups module in step 2.
  4. Finally, we’re able to use the webapp_elb_name variable within autoscaling_groups/webapp-asg.tf.

Collaborating with Teams

Chances are, if you’re using Terraform to build production infrastructure, you’re not working alone. If you need to collaborate on your Terraform templates, the best way to sync is by using Terraform Enterprise by HashiCorp. Terraform Enterprise allows your infrastructure templates to be version controlled, audited, and automatically deployed based on workflows you configure. There’s a lot to talk about when it comes to Terraform Enterprise, so we’ll save the deep dive for our next blog post.

Wrapping Up

We hope we’ve given you a good idea of how you can leverage the flexibility of Terraform to make managing your infrastructure less difficult. By using modules that logically correlate to your actual application or infrastructure configuration, you can improve agility and increase confidence in making changes to your infrastructure.

At HashiConf 2017, HashiCorp also introduced the Terraform Module Registry, where you can find verified and tested modules from AWS and HashiCorp trusted partners.

Take a look at Terraform by HashiCorp today: https://www.terraform.io/.