AWS DevOps Blog

Part 3: Develop, Deploy, and Manage for Scale with Elastic Beanstalk and CloudFormation

Today’s Topic: Application Config and Regional Portability with AWS Elastic Beanstalk

Welcome to the 3rd part of this 5-part series where we’ll cover best practices and practical tips & tricks for developing, deploying, and managing a web application, with an eye for application performance and operational efficiency, using AWS CloudFormation and Elastic Beanstalk.

We’ll introduce a new part of the series as a blog post each Monday, and discuss the post as well as take questions during an Office Hours-style Hangout that Thursday. All application source and accompanying CloudFormation templates are available on GitHub at http://github.com/awslabs/amediamanager


Last week (blog post and Office Hours video) we looked at building a VPC with CloudFormation and deploying the aMediaManager application into it.

This week, in the third part of the series, we’re going to drill down from the infrastructure level to the application layer. We’ll focus on two specific characteristics of the aMediaManager application:

  1. How we use S3 to dynamically generate, store, and access the configuration that our application requires (and how we can work with something simpler for local dev)
  2. How we built an app that works equally well in us-east-1 and ap-southeast-2

The Hangout

We’ll be discussing this blog post – including your Q&A – during a live Office Hours Hangout at 9a Pacific on Thursday, April 24, 2014. Sign up at https://plus.google.com/events/ceoe036ugsu6hndr6ncl616gr3k.

If You’re Just Joining Us

If this is the first post you’ve read in the series, be sure to check out Part 1 or Part 2 for more info on the app, including basic functionality and how to deploy it yourself.

Application Configuration

Our aMediaManager application has a lot of configuration data it needs to know about: RDS database connection info, DynamoDB table names, S3 bucket names, ElastiCache connection info, SQS and SNS names, etc. We handle config in a very straightforward way: a key=value-style property file and a simple config class that exposes the values to our app.

Here’s an example of an aMediaManager config file in its final form:

AWS_REGION=us-east-1
CACHE_ENABLED=true
CACHE_ENDPOINT=localhost
CACHE_PORT=11211
DDB_USERS_TABLE=amm-classic-2-UsersTable-1ES2BN0W8M47Z
DEFAULT_PROFILE_PIC_KEY=static/img/default_profile_pic.png
DEFAULT_VIDEO_POSTER_KEY=static/img/default_poster.png
RDS_DATABASE=amediamanager
RDS_INSTANCEID=adpbd6lj1fx1ht
RDS_PASSWORD=haha_yea_right
RDS_USERNAME=admin
S3_PROFILE_PIC_PREFIX=profile
S3_UPLOAD_BUCKET=amm-classic-2-appbucket-vgi84gfxykp4
S3_UPLOAD_PREFIX=uploads
TRANSCODE_PIPELINE=1395356294937-9q7qvk
TRANSCODE_PRESET=1395356295286-tq1i7v
TRANSCODE_QUEUE=https://sqs.us-east-1.amazonaws.com/960309676616/amm-classic-2-TranscodeQueue-1ANACHHGT0W7G
TRANSCODE_ROLE=arn:aws:iam::960309676616:role/amm-classic-2-TranscodeRole-N5CHHEH57VBI
TRANSCODE_TOPIC=arn:aws:sns:us-east-1:960309676616:amm-classic-2-TranscodeTopic-1K56DZRDLQ9X9
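
Since the file is plain key=value pairs, it maps directly onto java.util.Properties. As a generic illustration (not the app’s exact code), loading and reading it from the classpath looks like:

// Load the key=value config file from the classpath and read a value
public String lookupUploadBucket() throws IOException {
    Properties props = new Properties();
    try (InputStream in = getClass().getResourceAsStream("/app.properties")) {
        props.load(in);
    }
    return props.getProperty("S3_UPLOAD_BUCKET");
}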

Loading the App Configuration File

We wanted configuration for this application to be really simple and flexible. It should be possible to create a config file and use it in local dev, but it should also be possible to store the config file in S3, point our application at that remote location, and load it from there.

We defined the com.amediamanager.config.ConfigurationProvider abstract class and built three implementations to fit those scenarios:

  1. com.amediamanager.config.ClassResourceConfigurationProvider – This implementation loads a local file from the classpath (via getClass().getResourceAsStream("someFileName.properties")) and uses it for configuration.
  2. com.amediamanager.config.S3EnvConfigurationProvider – This implementation will load the application configuration file from S3. It expects the S3 bucket and object key that point to the file to be available as OS env vars named S3_CONFIG_BUCKET and S3_CONFIG_KEY, respectively. The implementation extends com.amediamanager.config.S3ConfigurationProvider which provides the basic S3 implementation.
  3. com.amediamanager.config.S3FileConfigurationProvider – This implementation will also load the application configuration file from S3, but it expects the S3 bucket and object key that point to the application configuration file to be stored in a local file at src/main/resources/s3config.properties. An example empty file looks like:

    S3_CONFIG_BUCKET=
    S3_CONFIG_KEY=
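
All three implementations share one simple contract. Here’s a rough sketch of its shape (the method name is illustrative; see the GitHub source for the real definition):

// Illustrative sketch only -- the real abstract class lives in com.amediamanager.config
public abstract class ConfigurationProvider {
    // Return the app's key=value configuration, or null if this
    // provider's source isn't available
    public abstract Properties loadProperties();
}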
    


We also want to make it simple to mix and match the configuration provider/implementation that you use. I like to use the ClassResourceConfigurationProvider when I’m developing locally in Eclipse, but when I deploy to Elastic Beanstalk I expect the configuration file to live in S3 and use environment variables to tell the application where to look (and thus expect it to use S3EnvConfigurationProvider to find its config). To help mix and match providers we built com.amediamanager.config.ConfigurationProviderChain, which allows us to chain together multiple ConfigurationProviders.

Finally, we have a utility class – com.amediamanager.config.ConfigurationSettings – that chooses the correct ConfigurationProvider from the chain and lets us actually read (or update) a configuration value.

Looking at the source, we can see how we defined the ConfigurationProviderChain in the ConfigurationSettings class to look for config in three different providers, in order: S3EnvConfigurationProvider > S3FileConfigurationProvider > ClassResourceConfigurationProvider.
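
To make that concrete, here’s a sketch of how such a chain might be assembled (constructor arguments are illustrative; the actual wiring lives in ConfigurationSettings on GitHub):

// Providers are consulted in order; the first one that can
// supply configuration wins
ConfigurationProviderChain chain = new ConfigurationProviderChain(
        new S3EnvConfigurationProvider(),           // 1. env vars point to S3
        new S3FileConfigurationProvider(),          // 2. s3config.properties points to S3
        new ClassResourceConfigurationProvider("app.properties")); // 3. local file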

We’re using Spring 3, so we can inject and use the ConfigurationSettings object in our app. So if I want to read, for example, the configuration value for the DynamoDB table we use to store user profile information, I would inject (using @Autowired) the object and look up the value:

@Component
public class SomeCoolClass {
  @Autowired
  private ConfigurationSettings config;

  public String getTheDynamoDbUsersTableName() {
    return config.getProperty(ConfigProps.DDB_USERS_TABLE);
  }
}

Manually Creating the App Config File

Remember that we built a CloudFormation template that creates all of the resources (e.g., the RDS database, DynamoDB table, SNS topic, and S3 bucket) that our application expects to know about via its configuration file. Here’s a visual of some of those resources:

We even included a scaffold configuration file at src/main/resources/app.properties.default that you could populate by hand. To do that, I would copy app.properties.default to app.properties in Eclipse, then locate the CloudFormation stack created by the amm-resources.cfn.json template file, look at the Outputs tab, and transcribe those values into the app.properties file in my IDE:

Automatically Generating the App Config File

Good news! Although it can sometimes be necessary or convenient to manually create a big configuration file, we can use Elastic Beanstalk to automate the whole thing. Let’s see how.

Getting the Config to Elastic Beanstalk

Recall that we use CloudFormation to provision every part of our environment. Let’s look back at the 3 stacks we used to deploy aMediaManager to a non-VPC environment:

Step 2 in the above diagram is where all the resources that need to be in our configuration file get created. Step 3 is where CloudFormation creates the Elastic Beanstalk application to deploy our app. Here’s how we’re going to automate generating that application configuration file:

  1. When the amm-master.cfn.json (Step 1) stack creates the amm-elasticbeanstalk.cfn.json stack in the 3rd and final step, we pass all of the Outputs from amm-resources.cfn.json (Step 2) into it as parameters. Here’s how we wire that together in the master amm-master.cfn.json template:

    "App1" : {
      "Type" : "AWS::CloudFormation::Stack",
      "Properties" : {
        "TemplateURL" : { "https://s3.amazonaws.com/path/to/amm-elasticbeanstalk.cfn.json",
        "Parameters" : {
          "RdsDbId"                : { "Fn::GetAtt" : [ "AppResources", "Outputs.RdsDbId" ]},
          "CacheEndpoint"          : { "Fn::GetAtt" : [ "AppResources", "Outputs.CacheEndpoint" ]},
          "CachePort"              : { "Fn::GetAtt" : [ "AppResources", "Outputs.CachePort" ]},
          "AppBucket"              : { "Fn::GetAtt" : [ "AppResources", "Outputs.AppBucket" ]},
          "TranscodeTopic"         : { "Fn::GetAtt" : [ "AppResources", "Outputs.TranscodeTopic" ]},
          "TranscodeQueue"         : { "Fn::GetAtt" : [ "AppResources", "Outputs.TranscodeQueue" ]},
          "TranscodeRoleArn"       : { "Fn::GetAtt" : [ "AppResources", "Outputs.TranscodeRoleArn" ]},
          "UsersTable"             : { "Fn::GetAtt" : [ "AppResources", "Outputs.UsersTable" ]},
          "InstanceSecurityGroup"  : { "Fn::GetAtt" : [ "AppResources", "Outputs.InstanceSecurityGroup" ]},
          ...
    
  2. Now, in the amm-elasticbeanstalk.cfn.json template we reference those input parameters and define OptionSettings in our Elastic Beanstalk Environment that make all of the values available as OS environment variables on the EC2 instances our application gets deployed to (I’ve abbreviated the list of variables in the snippet below):

    "Environment": {
      "Type": "AWS::ElasticBeanstalk::Environment",
      "Properties": {
        "ApplicationName": "...",
        "EnvironmentName" : "...",
        ...
        "OptionSettings": [
          {
            "Namespace": "aws:elasticbeanstalk:application:environment",
            "OptionName": "AMM_RDS_INSTANCEID",
            "Value": { "Ref": "RdsDbId" }
          },
          {
            "Namespace": "aws:elasticbeanstalk:application:environment",
            "OptionName": "AMM_CACHE_ENDPOINT",
            "Value": { "Ref": "CacheEndpoint" }
          },
          {
            "Namespace": "aws:elasticbeanstalk:application:environment",
            "OptionName": "AMM_DDB_USERS_TABLE",
            "Value": { "Ref": "UsersTable" }
          }
          ...
        ]
      }
    },
    

    Also note that we introduced a convention here, adding the AMM_ prefix to each environment variable, which makes the values easy to identify. We can see how these values materialize once the Elastic Beanstalk Environment has been created by viewing them in the Configuration section of the Management Console:

Getting the Config from Elastic Beanstalk to S3

Each EC2 instance that Elastic Beanstalk deploys our application to has all of these environment variables available to it. Now we want just one of those instances to aggregate the values into a file and upload it to S3!

Elastic Beanstalk makes it easy to run scripts on the EC2 instances our application is deployed to. We create a special .ebextensions directory in our application source and put two files in it:

  1. deploy_config.py – This simple Python script we wrote looks for all of the AMM_-prefixed environment variables on the system, turns them into a key=value properties file, and uploads that file to S3 (sketched in Java after this list).
  2. 01_app_config.config – Elastic Beanstalk recognizes files with the .config extension in the .ebextensions folder and processes them. In this scenario, we use 01_app_config.config to tell the EC2 instances created by Elastic Beanstalk to run deploy_config.py. And in this case we only need the script executed on one instance (there could be more than one instance in an Auto Scaling group), so we use the leader_only key to indicate that:

    container_commands:
      create_config_file:
        command: "python WEB-INF/.ebextensions/deploy_config.py"
        leader_only: true
    

    Note: There are a few other config files in the .ebextensions folder that you can browse on GitHub.
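
The script itself is Python (browse it on GitHub), but for readers following along in the app’s Java code, here’s the same logic sketched in Java. I’m assuming the upload destination comes from the S3_CONFIG_BUCKET and S3_CONFIG_KEY environment variables described in the next section:

// Illustrative Java rendering of deploy_config.py's logic: gather the
// AMM_-prefixed environment variables, strip the prefix, and upload the
// resulting key=value file to S3 using the instance's IAM role credentials.
import com.amazonaws.services.s3.AmazonS3Client;
import com.amazonaws.services.s3.model.ObjectMetadata;
import java.io.ByteArrayInputStream;
import java.io.StringWriter;
import java.util.Map;
import java.util.Properties;

public class DeployConfig {
    public static void main(String[] args) throws Exception {
        Properties props = new Properties();
        for (Map.Entry<String, String> var : System.getenv().entrySet()) {
            if (var.getKey().startsWith("AMM_")) {
                props.setProperty(var.getKey().substring("AMM_".length()), var.getValue());
            }
        }
        StringWriter contents = new StringWriter();
        props.store(contents, "aMediaManager app config");

        byte[] bytes = contents.toString().getBytes("UTF-8");
        ObjectMetadata meta = new ObjectMetadata();
        meta.setContentLength(bytes.length);
        // Instance profile (IAM role) credentials are picked up automatically on EC2
        new AmazonS3Client().putObject(System.getenv("S3_CONFIG_BUCKET"),
                System.getenv("S3_CONFIG_KEY"),
                new ByteArrayInputStream(bytes), meta);
    }
}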

After one of the instances in our Elastic Beanstalk environment runs deploy_config.py, a file containing all of our app config is stored in S3.

How the App in Elastic Beanstalk Accesses the Configuration in S3

At this point, CloudFormation has created our resources and deployed our Elastic Beanstalk environment, and that environment has uploaded the configuration to S3. But how does our Java app know where to load that config from? Recall that we defined the S3EnvConfigurationProvider implementation earlier to look for S3_CONFIG_BUCKET and S3_CONFIG_KEY in the environment and load config from there.
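
Conceptually, that provider’s lookup boils down to this simplified sketch (not the actual source):

// Sketch of the S3EnvConfigurationProvider idea: the environment tells
// us where the config file lives in S3; fetch and parse it from there
public Properties loadFromS3() throws IOException {
    String bucket = System.getenv("S3_CONFIG_BUCKET");
    String key = System.getenv("S3_CONFIG_KEY");
    if (bucket == null || key == null) {
        return null; // this provider can't help; fall through the chain
    }
    AmazonS3 s3 = new AmazonS3Client(); // IAM role credentials on EC2
    Properties props = new Properties();
    try (InputStream in = s3.getObject(bucket, key).getObjectContent()) {
        props.load(in);
    }
    return props;
}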

We provided those pointers to the Elastic Beanstalk Environment when we defined it in the CloudFormation template. Check out the source on GitHub to see the relevant lines in the CloudFormation template. I can find my running environment in the Management Console and view the values there, as well:

Using S3 Config in Local Dev

I’d much rather use the application config file that was created and stored automatically in S3 than have to create one by hand, and it’s really easy to do. Remember the S3FileConfigurationProvider? That let us create an s3config.properties file in our project containing pointers to the file in S3. I simply copy the values from the above screenshot and paste them into that file, and my local dev environment will use the config and resources automatically created by CloudFormation:
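
The finished file ends up looking something like this (the values here are illustrative; use the ones from your own environment):

    # Illustrative values -- copy the real ones from your environment
    S3_CONFIG_BUCKET=amm-classic-2-appbucket-vgi84gfxykp4
    S3_CONFIG_KEY=app.properties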

A Note About Security

The application config stored in S3 is private by default. This means our app, both in local dev and when deployed to Elastic Beanstalk, will need AWS credentials with the proper rights to access it. When you deploy the Elastic Beanstalk environment with CloudFormation, an IAM Role is created with the appropriate permissions, and the application automatically uses that IAM Role to make authenticated calls to access S3.

For your local dev environment, you will need to configure an Access Key and Secret Key that the application can use to make authenticated AWS API requests. I added my AK and SK as JVM args to the Tomcat Run Configuration in Eclipse:
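
The credentials chain shown in the next snippet includes SystemPropertiesCredentialsProvider, which reads the AWS SDK for Java’s standard aws.accessKeyId and aws.secretKey system properties, so the JVM arguments take this form (placeholders, of course):

    -Daws.accessKeyId=YOUR_ACCESS_KEY_ID
    -Daws.secretKey=YOUR_SECRET_KEY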

If you’re curious about credential management, check out the ServerConfig source on GitHub to see how we leveraged the AWSCredentialsProviderChain feature of the AWS SDK for Java to use a variety of different credentials providers, or come to the Hangout on Thursday for more detail! In the meantime, here’s a snippet:

@Bean
@Scope(WebApplicationContext.SCOPE_APPLICATION)
public AWSCredentialsProvider credentials() {
    return new AWSCredentialsProviderChain(
            new EnvironmentVariableCredentialsProvider(),
            new SystemPropertiesCredentialsProvider(),
            new InstanceProfileCredentialsProvider()
            );
}

Regional Portability

Last week in Part 2 of this series we talked about patterns for building CloudFormation templates that work well in multiple regions. Since we’ve made it easy to deploy this app to any of 8 AWS Regions, we should be sure that our app code behaves well when using resources like DynamoDB tables and SQS queues in different regions.

The key to regional portability lies in how we use the AWS SDK for Java to create the client objects that we use to interact with other AWS services. Let’s imagine we want to connect to DynamoDB. It’s easy to create an AmazonDynamoDBClient object and get to work:

// Create the client
AmazonDynamoDBClient dynamoClient = new AmazonDynamoDBClient();

// Build a request
GetItemRequest getItemRequest = new GetItemRequest()
                    .withTableName(config.getProperty(TABLE_NAME))
                    .addKeyEntry("email", new AttributeValue("bender@ilovebender.com"));

// Issue the request to find the User in DynamoDB
GetItemResult getItemResult = dynamoClient.getItem(getItemRequest);

However, because we didn’t specify otherwise, the dynamoClient we created points to the DynamoDB API endpoint in us-east-1. That’s great if we’re running in us-east-1, but what if we had deployed our application to the us-west-2 Region in Oregon (which means the DynamoDB table was also created in us-west-2 and doesn’t exist in us-east-1)? Well, it just wouldn’t work!

Use com.amazonaws.regions.Region to Create Clients

Fortunately, the AWS SDK for Java makes it simple to create clients for a specific region using the com.amazonaws.regions.Region class. We also did ourselves a favor by defining the current region in the AWS_REGION configuration setting. We could change the above example to dynamically create an AmazonDynamoDBClient in the appropriate region like so:

// Create a Region object from the region name in config
Region region = Region.getRegion(Regions.fromName(settings.getProperty(ConfigurationSettings.ConfigProps.AWS_REGION)));

// Get a client from the Region object
AmazonDynamoDBClient dynamoClient = region.createClient(AmazonDynamoDBClient.class, ...);

// Build a request
GetItemRequest getItemRequest = new GetItemRequest()
                    .withTableName(
                            config.getProperty(ConfigurationSettings.ConfigProps.DDB_USERS_TABLE))
                    .addKeyEntry("email", new AttributeValue("bender@ilovebender.com"));

// Issue the request to find the User in DynamoDB
GetItemResult getItemResult = dynamoClient.getItem(getItemRequest);

We leveraged Spring in this application to make it simple to work with clients. Take a look at com.amediamanager.springconfig.ServerConfig on GitHub to see how we defined a client for each AWS service we use. Here’s a snippet:

package com.amediamanager.springconfig;

...

@Configuration
public class ServerConfig {

  @Bean
  @Scope(WebApplicationContext.SCOPE_APPLICATION)
  public MemcachedClient memcachedClient(final ConfigurationSettings settings) throws IOException {
    String configEndpoint = settings.getProperty(ConfigurationSettings.ConfigProps.CACHE_ENDPOINT);
    Integer clusterPort = Integer.parseInt(settings.getProperty(ConfigurationSettings.ConfigProps.CACHE_PORT));

    return new MemcachedClient(new InetSocketAddress(configEndpoint, clusterPort));
  }

  @Bean
  @Scope(WebApplicationContext.SCOPE_APPLICATION)
  public Region region(final ConfigurationSettings settings) {
    return Region.getRegion(Regions.fromName(settings.getProperty(ConfigurationSettings.ConfigProps.AWS_REGION)));
  }

  @Bean
  @Scope(WebApplicationContext.SCOPE_APPLICATION)
  public AmazonElasticTranscoder transcodeClient(final AWSCredentialsProvider creds,
                                                 final Region region) {
    return region.createClient(AmazonElasticTranscoderClient.class, creds, null);
  }

  @Bean
  @Scope(WebApplicationContext.SCOPE_APPLICATION)
  public AmazonS3 s3Client(final AWSCredentialsProvider creds,
                           final Region region) {
    return region.createClient(AmazonS3Client.class, creds, null);
  }

  @Bean
  @Scope(WebApplicationContext.SCOPE_APPLICATION)
  public AmazonRDS rdsClient(final AWSCredentialsProvider creds,
                             final Region region) {
    return region.createClient(AmazonRDSClient.class, creds, null);
  }
  ...
}

Spring lets us inject these pre-configured clients in our source code, making it trivial to use AWS services in the right region in our application. Here we inject clients for S3 and DynamoDB into the DynamoDbUserDaoImpl class so we can manage user profile data:

package com.amediamanager.dao;

...

public class DynamoDbUserDaoImpl implements UserDao {

    @Autowired
    protected AmazonDynamoDB dynamoClient;

    @Autowired
    protected AmazonS3 s3Client;

    @Override
    public void save(User user) { ... }

    @Override
    public void update(User user) { ... }

    ...
}

Coming Up: Part 4

First, don’t forget to join us for the live Office Hours Hangout later this week (or view the recording if it’s past April 24, 2014 and you don’t have a time machine).

Next week in Part 4 of this series (blog post and Office Hours links forthcoming at http://blogs.aws.amazon.com/application-management) we’ll look at how we used Amazon S3 and Elastic Transcoder Service to scale video upload and conversion jobs for the aMediaManager application.