AWS Developer Blog

Using Amazon SQS with Spring Boot and Spring JMS

by Magnus Bjorkman | in Java

By favoring convention over configuration, Spring Boot reduces complexity and helps you start writing applications faster. Spring Boot allows you to bootstrap a framework that abstracts away many of the recurring patterns used in application development. You can leverage the simplicity that comes with this approach when you use Spring Boot and Spring JMS with Amazon SQS. Spring can be used to manage things like polling, acknowledgement, failure recovery, and so on, so you can focus on implementing application functionality.

In this post, we will show you how to implement the messaging layer of an application that creates thumbnails. In this use case, a client system sends a request message through Amazon SQS that includes the Amazon S3 location of an image. The application creates a thumbnail from the image and then notifies downstream systems about the new thumbnail it has uploaded to S3.

First we define the queues that we will use for incoming requests and sending out results:

[Screenshots: Amazon SQS queue configurations for thumbnail_requests, thumbnail_requests_dlq, and thumbnail_results]

Here are a few things to note:

  • For thumbnail_requests, we need to make sure the Default Visibility Timeout matches the upper limit of the estimated processing time for the requests. Otherwise, the message might become visible again and retried before we have completed the image processing and acknowledged (that is, deleted) the message.
  • The Amazon SQS Client Libraries for JMS explicitly set the polling wait time to 20 seconds, so changing the Receive Message Wait Time setting on either queue will not change polling behavior.
  • Because failures can happen while processing the image, we define a dead letter queue (DLQ) for thumbnail_requests: thumbnail_requests_dlq. Messages will be moved to the DLQ after three failed attempts. You could implement a separate process that handles the messages put on the DLQ, but that is beyond the scope of this post.
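
If you prefer to script the queue setup instead of using the console, the following is a minimal sketch using the AWS SDK for Java. The 300-second visibility timeout and the queue names are assumptions based on the notes above, so adjust them to your own processing times.

import java.util.HashMap;
import java.util.Map;

import com.amazonaws.auth.DefaultAWSCredentialsProviderChain;
import com.amazonaws.services.sqs.AmazonSQSClient;
import com.amazonaws.services.sqs.model.CreateQueueRequest;
import com.amazonaws.services.sqs.model.GetQueueAttributesRequest;

public class QueueSetup {

    public static void main(String[] args) {
        AmazonSQSClient sqs = new AmazonSQSClient(new DefaultAWSCredentialsProviderChain());

        // The results queue and the dead letter queue can use default settings.
        sqs.createQueue(new CreateQueueRequest("thumbnail_results"));
        String dlqUrl = sqs.createQueue(new CreateQueueRequest("thumbnail_requests_dlq")).getQueueUrl();

        // Look up the DLQ ARN so it can be referenced in the redrive policy.
        String dlqArn = sqs.getQueueAttributes(
                new GetQueueAttributesRequest(dlqUrl).withAttributeNames("QueueArn"))
                .getAttributes().get("QueueArn");

        // Request queue: a visibility timeout above the worst-case processing time (assumed 300s here)
        // and a redrive policy that moves messages to the DLQ after three failed receives.
        Map<String, String> attributes = new HashMap<String, String>();
        attributes.put("VisibilityTimeout", "300");
        attributes.put("RedrivePolicy",
                "{\"maxReceiveCount\":\"3\",\"deadLetterTargetArn\":\"" + dlqArn + "\"}");
        sqs.createQueue(new CreateQueueRequest("thumbnail_requests").withAttributes(attributes));
    }
}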

Now that the queues are created, we can build the Spring Boot application. We start by creating a Spring Boot application, either a command-line or a web application, and then add the dependencies we need to make it work with Amazon SQS.

The following shows the additional dependencies in Maven:

    <dependency>
        <groupId>org.springframework</groupId>
        <artifactId>spring-jms</artifactId>
    </dependency>

    <dependency>
        <groupId>com.amazonaws</groupId>
        <artifactId>aws-java-sdk</artifactId>
        <version>1.9.6</version>
    </dependency>

    <dependency>
        <groupId>com.amazonaws</groupId>
        <artifactId>amazon-sqs-java-messaging-lib</artifactId>
        <version>1.0.0</version>
        <type>jar</type>
    </dependency>

This adds the Spring JMS implementation, the AWS SDK for Java, and the Amazon SQS Client Libraries for JMS. Next, we will configure Spring Boot with Spring JMS and Amazon SQS by defining a Spring configuration class using the @Configuration annotation:

@Configuration
@EnableJms
public class JmsConfig {

    SQSConnectionFactory connectionFactory =
            SQSConnectionFactory.builder()
                    .withRegion(Region.getRegion(Regions.US_EAST_1))
                    .withAWSCredentialsProvider(new DefaultAWSCredentialsProviderChain())
                    .build();


    @Bean
    public DefaultJmsListenerContainerFactory jmsListenerContainerFactory() {
        DefaultJmsListenerContainerFactory factory =
                new DefaultJmsListenerContainerFactory();
        factory.setConnectionFactory(this.connectionFactory);
        factory.setDestinationResolver(new DynamicDestinationResolver());
        factory.setConcurrency("3-10");
        factory.setSessionAcknowledgeMode(Session.CLIENT_ACKNOWLEDGE);
        return factory;
    }

    @Bean
    public JmsTemplate defaultJmsTemplate() {
        return new JmsTemplate(this.connectionFactory);
    }

}

Spring Boot will find the configuration class and instantiate the beans defined in the configuration. The @EnableJms annotation will make Spring JMS scan for JMS listeners defined in the source code and use them with the beans in the configuration:

  • The Amazon SQS Client Libraries for JMS provides the SQSConnectionFactory class, which implements the ConnectionFactory interface as defined by the JMS standard, allowing it to be used with standard JMS interfaces and classes to connect to SQS.

    • Using DefaultAWSCredentialsProviderChain will give us multiple options for providing credentials, including using the IAM Role of the EC2 instance.
  • The JMS listener factory will be used when we define the thumbnail service to listen to messages.

    • Using the DynamicDestinationResolver will allow us to refer to Amazon SQS queues by their names in later classes.
    • The concurrency value of "3-10" means that the container starts with a minimum of 3 concurrent listeners and scales up to a maximum of 10.
    • Session.CLIENT_ACKNOWLEDGE will make Spring acknowledge (delete) the message after our service method is complete. If the method throws an exception, Spring will recover the message (that is, make it visible).
  • The JMS template will be used for sending messages.

Next we will define the service class that will listen to messages from our request queue.

@Service
public class ThumbnailerService {

    private Logger log = Logger.getLogger(ThumbnailerService.class);

    @Autowired
    private ThumbnailCreatorComponent thumbnailCreator;

    @Autowired
    private NotificationComponent notification;

    @JmsListener(destination = "thumbnail_requests")
    public void createThumbnail(String requestJSON) throws JMSException {
        log.info("Received ");
        try {
            ThumbnailRequest request=ThumbnailRequest.fromJSON(requestJSON);
            String thumbnailUrl=
                     thumbnailCreator.createThumbnail(request.getImageUrl());
            notification.thumbnailComplete(new ThumbnailResult(request,thumbnailUrl));
        } catch (IOException ex) {
            log.error("Encountered error while parsing message.",ex);
            throw new JMSException("Encountered error while parsing message.");
        }
    }

}

The @JmsListener annotation marks the createThumbnail method as the target of a JMS message listener. The method definition will be consumed by the processor for the @EnableJms annotation mentioned earlier. The processor will create a JMS message listener container using the container factory bean we defined earlier and the container will start to poll messages from Amazon SQS. As soon as it has received a message, it will invoke the createThumbnail method with the message content.

Here are some things to note:

  • We define the SQS queue to listen to (in this case thumbnail_requests) in the destination parameter.
  • If we throw an exception, the container will put the message back into the queue. In this example, the message will eventually be moved to the DLQ after three failed attempts.
  • We have autowired two components to process the image (ThumbnailCreatorComponent) and to send notifications (NotificationComponent) when the image has been processed.

The following shows the implementation of the NotificationComponent that is used to send an SQS message:

@Component
public class NotificationComponent {

    @Autowired
    protected JmsTemplate defaultJmsTemplate;

    public void thumbnailComplete(ThumbnailResult result) throws IOException {
        defaultJmsTemplate.convertAndSend("thumbnail_results", 
                                          result.toJSON());
    }

}

The component uses the JMS template defined in the configuration to send SQS messages. The convertAndSend method takes the name of the SQS queue that we want to send the message to.

The message bodies are JSON, which we can easily convert to and from our value objects. Here is an example with the class for the thumbnail request:

public class ThumbnailRequest {

    String objectId;
    String imageUrl;

    public String getObjectId() {
        return objectId;
    }

    public void setObjectId(String objectId) {
        this.objectId = objectId;
    }

    public String getImageUrl() {
        return imageUrl;
    }

    public void setImageUrl(String imageUrl) {
        this.imageUrl = imageUrl;
    }

    public static ThumbnailRequest fromJSON(String json) 
                               throws JsonProcessingException, IOException {
        ObjectMapper objectMapper=new ObjectMapper();
        return objectMapper.readValue(json, ThumbnailRequest.class);
    }
}
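
The ThumbnailResult value object used by the NotificationComponent is not shown in the original listing. A minimal sketch that mirrors the same Jackson-based approach, with fields matching the result JSON shown later in this post, might look like this:

import java.io.IOException;

import com.fasterxml.jackson.databind.ObjectMapper;

public class ThumbnailResult {

    String objectId;
    String imageUrl;
    String thumbnailUrl;

    public ThumbnailResult(ThumbnailRequest request, String thumbnailUrl) {
        this.objectId = request.getObjectId();
        this.imageUrl = request.getImageUrl();
        this.thumbnailUrl = thumbnailUrl;
    }

    public String getObjectId() {
        return objectId;
    }

    public String getImageUrl() {
        return imageUrl;
    }

    public String getThumbnailUrl() {
        return thumbnailUrl;
    }

    public String toJSON() throws IOException {
        // Serialize the result with Jackson, mirroring ThumbnailRequest.fromJSON.
        return new ObjectMapper().writeValueAsString(this);
    }
}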

We will skip implementing the component for scaling the image. It’s not important for this demonstration.

Now we can start the application and begin sending and receiving messages. When everything is up and running, we can test the application by submitting a message like the following to thumbnail_requests:

{
    "objectId":"12345678abcdefg",
    "imageUrl":"s3://mybucket/images/image1.jpg"
}

To send a message, go to the SQS console and select the thumbnail_requests queue. Choose Queue Actions and then choose Send a Message. You should see something like this in thumbnail_results:

{
    "objectId":"12345678abcdefg",
    "imageUrl":"s3://mybucket/images/image1.jpg",
    "thumbnailUrl":"s3://mybucket/thumbnails/image1_thumbnail.jpg"
}

To see the message, go to the SQS console and select the thumbnail_results queue. Choose Queue Actions and then choose View/Delete Messages.
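
If you prefer to drive the test from code instead of the console, you could reuse the JmsTemplate bean from the configuration to send the request. The following is a minimal, hypothetical sketch (the class name is illustrative):

@Component
public class TestRequestSender {

    @Autowired
    private JmsTemplate defaultJmsTemplate;

    public void sendTestRequest() {
        // Send a hand-written request body to the queue the listener is watching.
        defaultJmsTemplate.convertAndSend("thumbnail_requests",
                "{\"objectId\":\"12345678abcdefg\",\"imageUrl\":\"s3://mybucket/images/image1.jpg\"}");
    }
}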

Exploring ASP.NET Core Part 2: Continuous Delivery

by Norm Johanson | in .NET

The first post in this series discussed how to use an Amazon EC2 instance and AWS CodeDeploy to deploy ASP.NET Core applications from GitHub. The setup assumed all git pushes to GitHub were deployed to the running environment without validation. In this post, let’s examine how we can create an AWS environment for our ASP.NET Core application that gives us quality control and takes advantage of AWS cloud scale.

You’ll find the code and setup scripts for this post in the part2 branch of the aws-blog-net-exploring-aspnet-core repository.

Validating the Deployment

In the first post, the appspec.yml file called the InstallApp.ps1 script during the ApplicationStart hook to extract the application and set up IIS. To verify the application is up and running, let’s update the appspec.yml file to call ValidateInstall.ps1 during the ValidateService hook.

version: 0.0
os: windows
files:
  - source: 
    destination: C:\ExploringAspNetCore
hooks:
  ApplicationStop:
    - location: .\RemoveApp.ps1
      timeout: 30
  ApplicationStart:
    - location: .\InstallApp.ps1
      timeout: 300
  ValidateService:
    - location: .\ValidateInstall.ps1
      timeout: 300

This script allows us to call tests to make sure our application is running correctly. In the GitHub repository, I added some xUnit tests under .\SampleApp\src\SmokeTests. My sample application is no more than a simple "Hello World" web application, so I just need to test that I can make a web call to the application and get a valid response. In a real-world application, you would have a much more exhaustive suite of tests to run during this validation step.

Let’s take a look at ValidateInstall.ps1 to see how the tests are run.

sl C:\ExploringAspNetCore\SampleApp\src\SmokeTests

# Restore the nuget references
& "C:\Program Files\dotnet\dotnet.exe" restore

# Run the smoke tests
& "C:\Program Files\dotnet\dotnet.exe" test

exit $LastExitCode

To run the tests, switch to the directory where the tests are stored, restore the dependencies, and then run the dotnet test command. If any test fails, dotnet returns a non-zero exit code, which we return from the PowerShell script. AWS CodeDeploy will see the failed exit code and mark the deployment as a failure. We can then view the logs from the test run in the AWS CodeDeploy console to see what failed.

Using AWS CodePipeline

Now that deployments run smoke tests, we can detect bad deployments. We also want to make sure a bad deployment never reaches users. The best practice is to use a pipeline with a beta stage during which we run the smoke tests, and to promote to production only if beta succeeds. This gives us continuous delivery with safety checks. Again, as we discussed in part 1, we benefit from the ability of ASP.NET Core to run from source, so we do not have to configure a build step in our pipeline. Pipelines can pull source from Amazon S3 or GitHub. To provide a complete sample that can be set up with just a PowerShell script, we’ll use S3 as the source for our pipeline. AWS CodePipeline will monitor the S3 object for new versions and push them through the pipeline. For information about configuring a GitHub repository, see the AWS CodePipeline User Guide.

Setup Script

The PowerShell script .\EnvironmentSetup\EnvironmentSetup.ps1 in the repository will create the AWS resources required to deploy the application through a pipeline.

Note: To avoid charges for unused resources, be sure to run .\EnvironmentSetup\EnvironmentTearDown.ps1 when you are done testing.

The setup script sets up the following resources:

  • An S3 bucket with a zip of the archive as the initial deployment source.
  • A t2.small EC2 instance for beta.
  • An Auto Scaling group with a load balancer using t2.medium instances.
  • An AWS CodeDeploy application for beta using the t2.small EC2 instance.
  • An AWS CodeDeploy application for production using the Auto Scaling group.
  • AWS CodePipeline with the S3 bucket as the source and the beta and production stages configured to use the AWS CodeDeploy applications.

When the script is complete, it will print out the public DNS for the beta EC2 instance and production load balancer. We can monitor pipeline progress in the AWS CodePipeline console to see if the deployment was successful for both stages.

The application was deployed to both environments because the smoke tests were successful during the AWS CodeDeploy deployments.

Failed Deployments

Let’s see what happens when a deployment fails. We’ll force a test failure by opening the .\SampleApp\src\SmokeTests\WebsiteTests.cs test file and making a change that will cause the test to fail.

[Fact]
public async Task PassingTest()
{
    using (var client = new HttpClient())
    {
        var response = await client.GetStringAsync("http://localhost-not-a-real-host/");

        Assert.Equal("Exploring ASP.NET Core with AWS.", response);
    }
}

In the repository, we can run the .\DeployToPipeline.ps1 script, which will zip the archive and upload it to the S3 location used by the pipeline. This will kick off a deployment to beta. (The deployment will fail because of the bad test.)

A deployment will not be attempted during the production stage because of the failure at beta. This keeps production in a healthy state. To see what went wrong, we can view the deployment logs in the AWS CodeDeploy console.

Conclusion

With AWS CodeDeploy and AWS CodePipeline, we can build out a full continuous delivery system for deploying ASP.NET Core applications. Be sure to check out the GitHub repository for the sample and setup scripts. In the next post in this series, we’ll explore ASP.NET Core cross-platform support.

Introducing the Aws::Record Developer Preview

by Alex Wood | in Ruby

We are happy to announce that the aws-record gem is now in Developer Preview and available for you to try.

What Is Aws::Record?

In version 1 of the AWS SDK for Ruby, the AWS::Record class provided a data mapping abstraction over Amazon DynamoDB operations. As version 2 of the AWS SDK for Ruby was being developed, many of you asked for an updated version of the library.

The aws-record gem provides a data mapping abstraction for DynamoDB built on top of the AWS SDK for Ruby version 2.

Using Aws::Record

You can download the aws-record gem from RubyGems by including the --pre flag in a gem installation:

gem install 'aws-record' --pre

You can also include it in your Gemfile. Do not include a version lock yet, so that bundler can find the pre-release version:

# Gemfile
gem 'aws-record'

Defining a Model

To create an aws-record model, include the Aws::Record module in your class definition:

require 'aws-record'

class Forum
  include Aws::Record
end

This will decorate your class with helper methods you can use to create a model compatible with DynamoDB’s table schemas. You might define keys for your table:

require 'aws-record'

class Forum
  include Aws::Record
  string_attr  :forum_uuid, hash_key: true
  integer_attr :post_id,    range_key: true
end

When you use these helper methods, you do not need to worry about how to define these attributes and types in DynamoDB. The helper methods and marshaler classes are able to define your table and item operations for you. The aws-record gem comes with predefined attribute types that cover a variety of potential use cases:

require 'aws-record'

class Forum
  include Aws::Record
  string_attr   :forum_uuid, hash_key: true
  integer_attr  :post_id,    range_key: true
  string_attr   :author_username
  string_attr   :post_title
  string_attr   :post_body
  datetime_attr :created_at
  map_attr      :post_metadata
end

Creating a DynamoDB Table

The aws-record gem provides a helper class for table operations, such as migrations. If we wanted to create a table for our Forum model in DynamoDB, we would run the following migration:

require 'forum' # Depending on where you defined the class above.

migration = Aws::Record::TableMigration.new(Forum)

migration.create!(
  provisioned_throughput: {
    read_capacity_units: 10,
    write_capacity_units: 4
  }
)

migration.wait_until_available # Blocks until table creation is complete.

Operations with DynamoDB Items

With a model and table defined, we can perform operations that relate to items in our table. Let’s create a post:

require 'forum'
require 'securerandom'

uuid = SecureRandom.uuid

post = Forum.new
post.forum_uuid = uuid
post.post_id = 1
post.author_username = "User1"
post.post_title = "Hello!"
post.post_body = "Hello Aws::Record"
post.created_at = Time.now
post.post_metadata = {
  this_is_a: "Post",
  types_supported_include: ["String", "Integer", "DateTime"],
  how_many_times_ive_done_this: 1
}

post.save # Writes to the database.

This example shows us some of the types that are supported and serialized for you. Using the key we’ve defined, we can also find this object in our table:

my_post = Forum.find(forum_uuid: uuid, post_id: 1)
my_post.post_title # => "Hello!"
my_post.created_at # => #<DateTime: 2016-02-09T14:39:07-08:00 ((2457428j,81547s,0n),-28800s,2299161j)>

You can use the same approach to save changes or, as shown here, you can delete the item from the table:

my_post.delete! # => true

At this point, we know how to use Aws::Record to perform key-value store operations powered by DynamoDB and have an introduction to the types available for use in our tables.

Querying, Scanning, and Collections

Because you are likely doing Query and Scan operations in addition to key-value operations, aws-record provides support for integrating them with your model class.

When you include the Aws::Record module, your model class is decorated with #query and #scan methods, which correspond to the AWS SDK for Ruby client operations. The response is wrapped in a collection enumerable for you. Consider the following basic scan operation:

Forum.scan # => #<Aws::Record::ItemCollection:0x007ffc293ec790 @search_method=:scan, @search_params={:table_name=>"Forum"}, @model=Forum, @client=#<Aws::DynamoDB::Client>>

No client call has been made yet: ItemCollection instances are lazy and make client calls only when needed. Because they provide an enumerable interface, you can use any of Ruby’s enumerable methods on your collection, and your result page is saved:

resp = Forum.scan
resp.take(1) # Makes a call to the underlying client. Returns a 'Forum' object.
resp.take(1) # Same result, but does not repeat the client call.

Because Aws::Record::ItemCollection uses version 2 of the AWS SDK for Ruby, pagination support is built in. So, if your operation requires multiple DynamoDB client calls due to response truncation, ItemCollection will handle the calls required during your enumeration:

def author_posts
  Forum.scan.inject({}) do |acc, post|
    author = post.author_username
    if acc[author]
      acc[author] += 1
    else
      acc[author] = 1
    end
    acc
  end
end

The same applies for queries. Your query result will also be provided as an enumerable ItemCollection:

def posts_by_forum(uuid)
  Forum.query(
    key_condition_expression: "#A = :a",
    expression_attribute_names: {
      "#A" => "forum_uuid"
    },
    expression_attribute_values: {
      ":a" => uuid
    }
  )
end

Given this functionality, you have the flexibility to mix and match Ruby’s enumerable functionality with DynamoDB filter expressions, for example, to curate your results. These two functions return the same set of responses:

def posts_by_author_in_forum(uuid, author)
  posts_by_forum(uuid).select do |post|
    post.author_username == author
  end
end

def posts_by_author_in_forum_with_filter(uuid, author)
  Forum.query(
    key_condition_expression: "#A = :a",
    filter_expression: "#B = :b",
    expression_attribute_names: {
      "#A" => "forum_uuid",
      "#B" => "author_username"
    },
    expression_attribute_values: {
      ":a" => uuid,
      ":b" => author
    }
  )
end

Support for Secondary Indexes

Aws::Record also supports both local and global secondary indexes. Consider this modified version of our Forum table:

require 'aws-record'

class IndexedForum
  include Aws::Record

  string_attr   :forum_uuid, hash_key: true
  integer_attr  :post_id,    range_key: true
  string_attr   :author_username
  string_attr   :post_title
  string_attr   :post_body
  datetime_attr :created_at
  map_attr      :post_metadata

  global_secondary_index(:author,
    hash_key: :author_username,
    projection: {
      projection_type: "INCLUDE",
      non_key_attributes: ["post_title"]
    }
  )

  local_secondary_index(:by_date,
    range_key: :created_at,
    projection: {
      projection_type: "ALL"
    }
  )
end

You can see the table’s attributes are the same, but we’ve included a couple of potentially useful indexes.

  • :author: This uses the author name as the partition key, which provides a way to search across forums by author user name without having to scan and filter. Take note of the projection, because this global secondary index will only return the :forum_uuid, :post_id, :author_username, and :post_title attributes. Other attributes will be missing from this projection, and you would have to hydrate your item by calling #reload! on the item instance.
  • :by_date: This provides a way to sort and search within a forum by post creation date.

To create this table with secondary indexes, you create a migration like we did before:

require 'indexed_forum'

migration = Aws::Record::TableMigration.new(IndexedForum)

migration.create!(
  provisioned_throughput: {
    read_capacity_units: 10,
    write_capacity_units: 4
  },
  global_secondary_index_throughput: {
    author: {
      read_capacity_units: 5,
      write_capacity_units: 3
    }
  }
)

migration.wait_until_available

You can use either of these indexes with the query interface:

require 'indexed_forum'

def search_by_author(author)
  IndexedForum.query(
    index_name: "author",
    key_condition_expression: "#A = :a",
    expression_attribute_names: {
      "#A" => "author_username"
    },
    expression_attribute_values: {
      ":a" => author
    }
  )
end

Secondary indexes can be a powerful performance tool, and aws-record can simplify the process of managing them.

Get Involved!

Please download the gem, give it a try, and let us know what you think. This project is a work in progress, so we welcome feature requests, bug reports, and information about the kinds of problems you’d like to solve by using this gem. And, as with other SDKs and tools we produce, we’d also be happy to look at contributions.

You can find the project on GitHub at https://github.com/awslabs/aws-sdk-ruby-record

Please reach out and let us know what you think!

Parallelizing Large Uploads for Speed and Reliability

by Magnus Bjorkman | in Java

As Big Data grows in popularity, it becomes more important to move large data sets to and from Amazon S3. You can improve the speed of uploads by parallelizing them. You can break an individual file into multiple parts and upload those parts in parallel by setting the following in the AWS SDK for Java:

TransferManager tx = new TransferManager(
        new AmazonS3Client(new DefaultAWSCredentialsProviderChain()),
        Executors.newFixedThreadPool(5));

TransferManagerConfiguration config = new TransferManagerConfiguration();
config.setMultipartUploadThreshold(5 * 1024 * 1024);
config.setMinimumUploadPartSize(5 * 1024 * 1024);
tx.setConfiguration(config);

There are a few things to note:

  • When we create the TransferManager, we give it an execution pool of 5 threads. By default, the TransferManager creates a pool of 10, but you can set this to scale the pool size.
  • The TransferManagerConfiguration allows us to set the limits used by the AWS SDK for Java to break large files into smaller parts.
  • MultipartUploadThreshold defines the size at which the AWS SDK for Java should start breaking apart the files (in this case, 5 MiB).
  • MinimumUploadPartSize defines the minimum size of each part. It must be at least 5 MiB; otherwise, you will get an error when you try to upload it.

In addition to improved upload speeds, an advantage to doing this is that your uploads will become more reliable, because if you have a failure, it will occur on a small part of the upload, rather than the entire upload. The retry logic built into the AWS SDK for Java will try to upload the part again. You can control the retry policy when you create the S3 client.

ClientConfiguration clientConfiguration = new ClientConfiguration();
clientConfiguration.setRetryPolicy(
        PredefinedRetryPolicies.getDefaultRetryPolicyWithCustomMaxRetries(5));
TransferManager tx = new TransferManager(
        new AmazonS3Client(new DefaultAWSCredentialsProviderChain(), clientConfiguration),
        Executors.newFixedThreadPool(5));

Here we change the default setting of 3 retry attempts to 5. You can implement your own back-off strategies and define your own retry-able failures.
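
As an illustration of that extension point, the following sketch keeps the SDK’s default retry condition but plugs in a custom fixed backoff; the two-second delay is an arbitrary value chosen only to show where a custom strategy goes.

RetryPolicy.BackoffStrategy fixedBackoff = new RetryPolicy.BackoffStrategy() {
    public long delayBeforeNextRetry(AmazonWebServiceRequest originalRequest,
                                     AmazonClientException exception,
                                     int retriesAttempted) {
        // Wait a flat two seconds between attempts instead of exponential backoff.
        return 2000;
    }
};

ClientConfiguration clientConfiguration = new ClientConfiguration();
clientConfiguration.setRetryPolicy(new RetryPolicy(
        PredefinedRetryPolicies.DEFAULT_RETRY_CONDITION, // keep the SDK's default notion of retryable failures
        fixedBackoff,
        5,      // maximum number of retry attempts
        true)); // honor the max error retry set in ClientConfiguration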

Although this is useful for a single file, especially a large one, people often have large collections of files. If those files are close in size to the multipart threshold, you need to submit multiple files to the TransferManager at the same time to get the benefits of parallelization. This requires a little more effort but is straightforward. First, we define a list of uploads.

List<PutObjectRequest> objectList = new ArrayList<>();
objectList.add(new PutObjectRequest("mybucket", "folder/myfile1.dat",
        new File("/localfolder/myfile1.dat")));
objectList.add(new PutObjectRequest("mybucket", "folder/myfile2.dat",
        new File("/localfolder/myfile2.dat")));

Then we can submit the files for upload:

CountDownLatch doneSignal = new CountDownLatch(objectList.size());
List<Upload> uploads = new ArrayList<>();
for (PutObjectRequest object : objectList) {
    object.setGeneralProgressListener(new UploadCompleteListener(object.getFile(),
            object.getBucketName() + "/" + object.getKey(), doneSignal));
    uploads.add(tx.upload(object));
}
try {
    doneSignal.await();
} catch (InterruptedException e) {
    throw new UploadException("Couldn't wait for all uploads to be finished");
}

The upload command is simple: just call the upload method on the TransferManager. That method is not blocking, so it will just schedule the upload and immediately return. To track progress and figure out when the uploads are complete:

  • We use a CountDownLatch, initializing it to the number of files to upload.
  • We register a general progress listener with each PutObjectRequest, so we can capture major events, including completion and failures that will count down the CountDownLatch.
  • After we have submitted all of the uploads, we use the CountDownLatch to wait for the uploads to complete.

The following is a simple implementation of the progress listener that allows us to track the uploads. It also contains some print statements to allow us to see what is happening when we test this:

class UploadCompleteListener implements ProgressListener {

    private static Log log = LogFactory.getLog(UploadCompleteListener.class);

    CountDownLatch doneSignal;
    File f;
    String target;

    public UploadCompleteListener(File f, String target,
                                  CountDownLatch doneSignal) {
        this.f = f;
        this.target = target;
        this.doneSignal = doneSignal;
    }

    public void progressChanged(ProgressEvent progressEvent) {
        if (progressEvent.getEventType()
                == ProgressEventType.TRANSFER_STARTED_EVENT) {
            log.info("Started to upload: " + f.getAbsolutePath()
                + " -> " + this.target);
        }
        if (progressEvent.getEventType()
                == ProgressEventType.TRANSFER_COMPLETED_EVENT) {
            log.info("Completed upload: " + f.getAbsolutePath()
                + " -> " + this.target);
            doneSignal.countDown();
        }
        if (progressEvent.getEventType()
                == ProgressEventType.TRANSFER_FAILED_EVENT) {
            log.info("Failed upload: " + f.getAbsolutePath()
                + " -> " + this.target);
            doneSignal.countDown();
        }
    }
}

After you have finished, don’t forget to shut down the transfer manager.

tx.shutdownNow();

Another great option for moving very large data sets is the AWS Import/Export Snowball service, a petabyte-scale data transport solution that uses secure appliances to transfer large amounts of data into and out of the AWS cloud.

Using CMake Exports with the AWS SDK for C++

by Jonathan Henson | in C++

This is our very first C++ post on the AWS Developer Blog, and there will be more to come. We are excited to receive and share feedback with the C++ community. This first post will start where most projects start: building a simple program.

Building an application in C++ can be a daunting task—especially when dependencies are involved. Even after you have figured out what you want to do and which libraries you need to use, you encounter seemingly endless, painful tasks to compile, link, and distribute your application.

AWS SDK for C++ users most frequently report difficulty compiling and linking against the SDK. This involves building the SDK, installing the header files and libraries somewhere, updating the build system of the application with the include and linker paths, and passing definitions to the compiler. This is an error-prone, and now unnecessary, process. CMake has built-in functionality that handles this scenario, and we have updated the CMake build scripts to handle this complexity for you.

The example we will use in this post assumes you are familiar with Amazon Simple Storage Service (Amazon S3) and know how to download and build the SDK. For more information, see our README on GitHub. Suppose we want to write a simple program to upload and retrieve objects from Amazon S3. The code would look something like this:


#include <aws/s3/S3Client.h>
#include <aws/s3/model/PutObjectRequest.h>
#include <aws/s3/model/GetObjectRequest.h>
#include <aws/core/Aws.h>
#include <aws/core/utils/memory/stl/AWSStringStream.h> 

using namespace Aws::S3;
using namespace Aws::S3::Model;

static const char* KEY = "s3_cpp_sample_key";
static const char* BUCKET = "s3-cpp-sample-bucket";

int main()
{
    Aws::SDKOptions options;
    Aws::InitAPI(options);
	{
		S3Client client;

		//first put an object into s3
		PutObjectRequest putObjectRequest;
		putObjectRequest.WithKey(KEY)
			   .WithBucket(BUCKET);

		//this can be any arbitrary stream (e.g. fstream, stringstream etc...)
		auto requestStream = Aws::MakeShared<Aws::StringStream>("s3-sample");
		*requestStream << "Hello World!";

		//set the stream that will be put to s3
		putObjectRequest.SetBody(requestStream);

		auto putObjectOutcome = client.PutObject(putObjectRequest);

		if(putObjectOutcome.IsSuccess())
		{
			std::cout << "Put object succeeded" << std::endl;
		}
		else
		{
			std::cout << "Error while putting Object " << putObjectOutcome.GetError().GetExceptionName() << 
				   " " << putObjectOutcome.GetError().GetMessage() << std::endl;
		}

		//now get the object back out of s3. The response stream can be overridden here if you want it to go directly to 
		// a file. In this case the default string buf is exactly what we want.
		GetObjectRequest getObjectRequest;
		getObjectRequest.WithBucket(BUCKET)
			.WithKey(KEY);

		auto getObjectOutcome = client.GetObject(getObjectRequest);

		if(getObjectOutcome.IsSuccess())
		{
			std::cout << "Successfully retrieved object from s3 with value: " << std::endl;
			std::cout << getObjectOutcome.GetResult().GetBody().rdbuf() << std::endl << std::endl;
		}
		else
		{
			std::cout << "Error while getting object " << getObjectOutcome.GetError().GetExceptionName() <<
				 " " << getObjectOutcome.GetError().GetMessage() << std::endl;
		}
	}
    Aws::ShutdownAPI(options);
    return 0;  
}

Here, we have a direct dependency on aws-cpp-sdk-s3 and an indirect dependency on aws-cpp-sdk-core. Furthermore, we have several platform-specific dependencies that are required to make this work. On Windows, this involves WinHttp and BCrypt. On Linux, curl and OpenSSL. On OSX, curl and CommonCrypto. Other platforms, such as mobile, have their own dependencies. Traditionally, you would need to update your build system to detect each of these platforms and inject the right properties for each target.

However, the build process for the SDK already has access to this information from its configuration step. Why should you have to worry about this mess? Enter CMake export(). What would a CMakeLists.txt look like to build this program? This file generates our build artifacts for each platform we need to support—Visual Studio, XCode, AutoMake, and so on.


cmake_minimum_required(VERSION 2.8)
project(s3-sample)

#this will locate the aws sdk for c++ package so that we can use its targets
find_package(aws-sdk-cpp)

add_executable(s3-sample main.cpp)

#since we called find_package(), this will resolve all dependencies, header files, and cflags necessary
#to build and link your executable. 
target_link_libraries(s3-sample aws-cpp-sdk-s3)

That’s all you need to build your program. When we run this script for Visual Studio, CMake will determine that aws-cpp-sdk-s3 has dependencies on aws-cpp-sdk-core, WinHttp, and BCrypt. Also, the CMake configuration for the aws-sdk-cpp package knows whether the SDK was built using custom memory management. It will make sure the -DAWS_CUSTOM_MEMORY_MANAGEMENT flag is passed to your compiler if needed. The resulting Visual Studio projects will already have the include and linker arguments set and will contain compile definitions that need to be passed to your compiler. On GCC and Clang, we will also pass the -std=c++11 flag for you.

To configure your project, simply run the following:


cmake -Daws-sdk-cpp_DIR=<path to your SDK build> <path to your source>

You can pass additional CMake arguments, such as -G "Visual Studio 12 2013 Win64", too.

Now you are ready to build with msbuild, make, or whatever other build system you are using.

Obviously, not everyone uses or even wants to use CMake. The aws-sdk-cpp-config.cmake file will contain all of the information required to update your build script to use the SDK.

We’d like to extend a special thanks to our GitHub users for requesting this feature, especially Rico Huijbers, who shared a blog post on the topic. His original post can be found here.

We are excited to be offering better support to the C++ community. We invite you to try this feature and leave feedback here or on GitHub.

Exploring ASP.NET Core Part 1: Deploying from GitHub

by Norm Johanson | in .NET

ASP.NET Core, formerly ASP.NET 5, is a platform that offers many possibilities for deploying .NET applications. This series of posts will explore options for deploying ASP.NET applications on AWS.

What Is ASP.NET Core?

ASP.NET Core is the new open-source, cross-platform, and modularized implementation of ASP.NET. It is currently under development, so expect future posts to cover updates and changes (for example, the new CLI).

Deploying from GitHub

The AWS CodeDeploy deployment service can be configured to trigger deployments from GitHub. Before ASP.NET Core, .NET applications had to be built before they were deployed. ASP.NET Core applications can be deployed and run from the source.

Sample Code and Setup Scripts

The code and setup scripts for this blog can be found in the aws-blog-net-exploring-aspnet-core repository in the part1 branch.

Setting Up AWS CodeDeploy

AWS CodeDeploy automates deployments to Amazon EC2 instances that you set up and configure as a deployment group. For more information, see the AWS CodeDeploy User Guide.

Although ASP.NET Core offers cross-platform support, in this post we are using instances running Microsoft Windows Server 2012 R2. The Windows EC2 instances must have IIS, the .NET Core SDK, and the Windows Server Hosting bundle installed. The Windows Server Hosting bundle, also called the ASP.NET Core Module, is required to enable IIS to communicate with Kestrel, the ASP.NET Core web server.

To set up the AWS CodeDeploy environment, you can run the .\EnvironmentSetup\EnvironmentSetup.ps1 PowerShell script in the GitHub repository. This script will create an AWS CloudFormation stack that will set up an EC2 instance and configure AWS CodeDeploy and IIS with the .NET Core SDK and Windows Server Hosting. It will then set up an AWS CodeDeploy application called ExploringAspNetCorePart1.

To avoid ongoing charges for AWS resources, after you are done with your testing, be sure to run the .\EnvironmentSetup\EnvironmentTearDown.ps1 PowerShell script.

GitHub and AWS CodeDeploy

You can use the AWS CodeDeploy console to connect your AWS CodeDeploy application to a GitHub repository. Then you can initiate deployments to the AWS CodeDeploy application by specifying the GitHub repository and commit ID. The AWS CodeDeploy team has written a blog post that describes how to configure the repository to automatically push a deployment to the AWS CodeDeploy application.

Deploying from Source

When you deploy from GitHub, the deployment bundle is a zip archive of the repository. In the root of the repository is an appspec.yml file that tells AWS CodeDeploy how to deploy our application. For our application, the appspec.yml is very simple:

version: 0.0
os: windows
files:
  - source: 
    destination: C:\ExploringAspNetCore
hooks:
  ApplicationStop:
    - location: .\RemoveApp.ps1
      timeout: 30
  ApplicationStart:
    - location: .\InstallApp.ps1
      timeout: 300

The file tells AWS CodeDeploy to extract the files from our repository to C:\ExploringAspNetCore and then run the PowerShell script, InstallApp.ps1, to start the application. The script has three parts. The first part restores all the dependencies for the application.

# Restore the nuget references
& "C:\Program Files\dotnet\dotnet.exe" restore

The second part packages the application for publishing.

# Publish application with all of its dependencies and runtime for IIS to use
& "C:\Program Files\dotnet\dotnet.exe" publish --configuration release -o c:\ExploringAspNetCore\publish --runtime active

The third part updates IIS to point to the publishing folder. The AWS CodeDeploy agent is a 32-bit application and runs PowerShell scripts with the 32-bit version of PowerShell. To access IIS with PowerShell, we need to use the 64-bit version. That’s why this section passes the script into the 64-bit version of powershell.exe.

C:\Windows\SysNative\WindowsPowerShell\v1.0\powershell.exe -Command {
             Import-Module WebAdministration
             Set-ItemProperty 'IIS:\Sites\Default Web Site' 
                 -Name physicalPath -Value c:\ExploringAspNetCore\publish
}

Note: This line was formatted for readability. For the correct syntax, view the script in the repository.

If we have configured the GitHub repository to push deployments to AWS CodeDeploy, then after every push, the code change will be zipped up and sent to AWS CodeDeploy. Then AWS CodeDeploy will execute the appspec.yml and InstallApp.ps1 and the EC2 instance will be up-to-date with the latest code — no build step required.

Share Your Feedback

Check out the aws-blog-net-exploring-aspnet-core repository and let us know what you think. We’ll keep adding ideas to the repository. Feel free to open an issue to share your own ideas for deploying ASP.NET Core applications.

Deploying Java Applications on Elastic Beanstalk from Maven

by Zhaoxi Zhang | in Java

The Beanstalker open source project now supports Java SE application development and deployment directly to AWS Elastic Beanstalk using the Maven archetype elasticbeanstalk-javase-archetype. With just a few commands in a terminal, you can create and deploy a Java SE application. This blog post provides step-by-step instructions for using this archetype.

First, in the terminal, run the mvn archetype:generate command. Use elasticbeanstalk as the filter, choose elasticbeanstalk-javase-archetype as the target archetype, and set the required properties when prompted. The following listing shows the execution of this command.

$ mvn archetype:generate -Dfilter=elasticbeanstalk
[INFO] Scanning for projects...
[INFO]                                                                         
[INFO] ------------------------------------------------------------------------
[INFO] Building Maven Stub Project (No POM) 1
[INFO] ------------------------------------------------------------------------
[INFO] 
[INFO] >>> maven-archetype-plugin:2.3:generate (default-cli) >
 generate-sources @ standalone-pom >>>
[INFO] 
[INFO] <<< maven-archetype-plugin:2.3:generate (default-cli) <
 generate-sources @ standalone-pom <<<
[INFO] 
[INFO] --- maven-archetype-plugin:2.3:generate (default-cli) @
 standalone-pom ---
[INFO] Generating project in Interactive mode
[INFO] No archetype defined. Using maven-archetype-quickstart
 (org.apache.maven.archetypes:maven-archetype-quickstart:1.0)
Choose archetype:
1: remote -> 
br.com.ingenieux:elasticbeanstalk-docker-dropwizard-webapp-archetype
 (A Maven Archetype for Publishing Dropwizard-based Services on AWS'
 Elastic Beanstalk Service)
2: remote -> 
br.com.ingenieux:elasticbeanstalk-javase-archetype
 (A Maven Archetype Encompassing Jetty for Publishing Java SE
 Services on AWS' Elastic Beanstalk Service)
3: remote -> 
br.com.ingenieux:elasticbeanstalk-service-webapp-archetype
 (A Maven Archetype Encompassing RestAssured, Jetty, Jackson, Guice
 and Jersey for Publishing JAX-RS-based Services on AWS' Elastic
 Beanstalk Service)
4: remote -> 
br.com.ingenieux:elasticbeanstalk-wrapper-webapp-archetype
 (A Maven Archetype Wrapping Existing war files on AWS' Elastic
 Beanstalk Service)
Choose a number or apply filter
 (format: [groupId:]artifactId, case sensitive contains): : 2
Choose br.com.ingenieux:elasticbeanstalk-javase-archetype version: 
1: 1.4.3-SNAPSHOT
2: 1.4.3-foralula
Choose a number: 2: 1
Define value for property 'groupId': : org.demo.foo
Define value for property 'artifactId': : jettyjavase
Define value for property 'version':  1.0-SNAPSHOT: : 
Define value for property 'package':  org.demo.foo: : 
Confirm properties configuration:
groupId: org.demo.foo
artifactId: jettyjavase
version: 1.0-SNAPSHOT
package: org.demo.foo
 Y: : 
[INFO] ------------------------------------------------------------------------
[INFO] Using following parameters for creating project from Archetype:
 elasticbeanstalk-javase-archetype:1.4.3-SNAPSHOT
[INFO] ------------------------------------------------------------------------
[INFO] Parameter: groupId, Value: org.demo.foo
[INFO] Parameter: artifactId, Value: jettyjavase
[INFO] Parameter: version, Value: 1.0-SNAPSHOT
[INFO] Parameter: package, Value: org.demo.foo
[INFO] Parameter: packageInPathFormat, Value: org/demo/foo
[INFO] Parameter: package, Value: org.demo.foo
[INFO] Parameter: version, Value: 1.0-SNAPSHOT
[INFO] Parameter: groupId, Value: org.demo.foo
[INFO] Parameter: artifactId, Value: jettyjavase
[INFO] project created from Archetype in dir: /current/directory/jettyjavase
[INFO] ------------------------------------------------------------------------
[INFO] BUILD SUCCESS
[INFO] ------------------------------------------------------------------------
[INFO] Total time: 24:07 min
[INFO] Finished at: 2016-02-19T09:53:54+08:00
[INFO] Final Memory: 14M/211M
[INFO] ------------------------------------------------------------------------

A new folder, jettyjavase, will be created under your current working directory. If you go to this folder and use the tree command to explore the structure, you will see the following:

$ tree
.
├── Procfile
├── pom.xml
└── src
    └── main
        ├── assembly
        │   └── zip.xml
        ├── java
        │   └── org
        │       └── demo
        │           └── foo
        │               └── Application.java
        └── resources
            └── index.html

8 directories, 5 files

The archetype uses java-se-jetty-maven, the sample project provisioned by AWS Elastic Beanstalk, as the template. This sample project generates a single executable jar file. Instructions that tell the server how to run the jar file are included in the Procfile.
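
For the jettyjavase project generated above, the Procfile boils down to a single line along these lines (the exact jar name depends on your artifact ID and version, so treat this as illustrative):

web: java -jar jettyjavase-1.0-SNAPSHOT.jar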

As described here, there are basically three ways to deploy a Java SE application to AWS Elastic Beanstalk:

  • deploying a single executable jar.
  • deploying a single zip source bundle file that contains multiple jars and a Procfile.
  • deploying a single zip source bundle file that contains the source code, as-is, and a Buildfile.

This archetype supports the second option only. Before you deploy this sample project to Elastic Beanstalk, you might want to check that the default configurations are what you want. You’ll find the default configurations in the <properties> section of the pom.xml file (a sketch of that section follows the list below).

  • beanstalk.s3Bucket and beanstalk.s3Key properties. The default Amazon S3 bucket used to store the zip source bundle file is configured as ${project.groupId} (in this case, org.demo.foo). If you don’t have this bucket available in your S3 account, use the AWS CLI to create one, as shown here. You can also change the default setting of beanstalk.s3Bucket to your preferred bucket. You must have permission to upload to the specified bucket.

    $ aws s3 mb s3://org.demo.foo
    make_bucket: s3://org.demo.foo/
    
  • mainApplication property. This is the application entry point used to generate the executable jar file.
  • beanstalk.mainJar property. This is the jar file built from your project and referenced on the first line of the Procfile.
  • beanstalk.sourceBundle property. This is the final zip source bundle file. This property should not be changed, because it is derived from the maven-assembly-plugin file-naming logic.
  • beanstalk.artifactFile property. This is the target artifact file to be uploaded to the S3 bucket.
  • beanstalk.solutionStack property. This is the solution stack used by the Java SE platform. You can change the default value, but make sure it is one of the supported solution stacks.
  • beanstalk.environmentName property. This is the name of the environment to which your application will be deployed. The default value is the artifact ID.
  • beanstalk.cnamePrefix property. The default value for the CNAME prefix is the artifact ID. Reconfigure it if the default value is already in use.
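
As a rough, illustrative sketch only (the archetype generates the real values, and the exact property set may differ), the defaults described above map to entries like these in the pom.xml <properties> section:

<properties>
    <!-- Illustrative values based on the defaults described above. -->
    <beanstalk.s3Bucket>org.demo.foo</beanstalk.s3Bucket>
    <mainApplication>org.demo.foo.Application</mainApplication>
    <beanstalk.environmentName>jettyjavase</beanstalk.environmentName>
    <beanstalk.cnamePrefix>jettyjavase</beanstalk.cnamePrefix>
    <beanstalk.solutionStack>64bit Amazon Linux 2015.09 v2.0.4 running Java 7</beanstalk.solutionStack>
    <!-- beanstalk.s3Key, beanstalk.mainJar, beanstalk.sourceBundle, and beanstalk.artifactFile
         are derived by the archetype and are best left as generated. -->
</properties>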

This archetype uses the maven-assembly-plugin to create the uber jar file and the zip source bundle file. Go to src/main/assembly/zip.xml and make changes, as necessary, for files you want to include or exclude from the zip source bundle file. Update the Procfile if you want to run multiple applications on the server. Make sure the first line in the file starts with "web: ".

After you have finished the configuration, you are ready to deploy the sample project to AWS Elastic Beanstalk. A single command, mvn -Ps3-deploy package deploy, will do the rest of the work. If the sample project is deployed successfully, you will be able to access the http://jettyjavase.us-east-1.elasticbeanstalk.com/ endpoint and should see the Congratulations page. The following are excerpts from the command output.

$ mvn -Ps3-deploy package deploy
[INFO] Scanning for projects...
...
[INFO] --- maven-assembly-plugin:2.2-beta-5:single (package-jar) @
 jettyjavase ---
...
[WARNING] Replacing pre-existing project main-artifact file:
 /current/directory/jettyjavase/target/jettyjavase-1.0-SNAPSHOT.jar
with assembly file:
 /current/directory/jettyjavase/target/jettyjavase-1.0-SNAPSHOT.jar
...
[INFO] --- maven-assembly-plugin:2.2-beta-5:single (package-zip) @
 jettyjavase ---
[INFO] Reading assembly descriptor: src/main/assembly/zip.xml
[INFO] Building zip:
 /current/directory/jettyjavase/target/jettyjavase-1.0-SNAPSHOT.zip
...
[INFO] --- beanstalk-maven-plugin:1.4.2:upload-source-bundle (deploy) @
 jettyjavase ---
[INFO] Target Path:
 s3://org.demo.foo/jettyjavase-1.0-SNAPSHOT-20160219040404.zip
[INFO] Uploading artifact file:
 /current/directory/jettyjavase/target/jettyjavase-1.0-SNAPSHOT.zip
  100.00% 945 KiB/945 KiB                        Done
[INFO] Artifact Uploaded
[INFO] SUCCESS
[INFO] null/void result
[INFO] 
[INFO] --- beanstalk-maven-plugin:1.4.2:create-application-version (deploy) @
 jettyjavase ---
[INFO] SUCCESS
[INFO] {
[INFO]   "applicationName" : "jettyjavase",
[INFO]   "description" : "Update from beanstalk-maven-plugin",
[INFO]   "versionLabel" : "20160219040404",
[INFO]   "sourceBundle" : {
[INFO]     "s3Bucket" : "org.demo.foo",
[INFO]     "s3Key" : "jettyjavase-1.0-SNAPSHOT-20160219040404.zip"
[INFO]   },
[INFO]   "dateCreated" : 1455854658750,
[INFO]   "dateUpdated" : 1455854658750
[INFO] }
[INFO] 
[INFO] --- beanstalk-maven-plugin:1.4.2:put-environment (deploy) @
 jettyjavase ---
[INFO] ... with cname set to 'jettyjavase.elasticbeanstalk.com'
[INFO] ... with status *NOT* set to 'Terminated'
[INFO] Environment Lookup
[INFO] ... with environmentId equal to 'e-pa2mn9iqkw'
[INFO] ... with status   set to 'Ready'
[INFO] ... with health equal to 'Green'
...
[INFO] SUCCESS
[INFO] {
[INFO]   "environmentName" : "jettyjavase",
[INFO]   "environmentId" : "e-pa2mn9iqkw",
[INFO]   "applicationName" : "jettyjavase",
[INFO]   "versionLabel" : "20160219040404",
[INFO]   "solutionStackName" : "64bit Amazon Linux 2015.09 v2.0.4 running Java 7",
[INFO]   "description" : "Java Sample Jetty App",
[INFO]   "dateCreated" : 1455854662359,
[INFO]   "dateUpdated" : 1455854662359,
[INFO]   "status" : "Launching",
[INFO]   "health" : "Grey",
[INFO]   "tier" : {
[INFO]     "name" : "WebServer",
[INFO]     "type" : "Standard",
[INFO]     "version" : " "
[INFO]   },
[INFO]   "cname" : "jettyjavase.us-east-1.elasticbeanstalk.com"
[INFO] }
[INFO] ------------------------------------------------------------------------
[INFO] BUILD SUCCESS
[INFO] ------------------------------------------------------------------------
[INFO] Total time: 05:14 min
[INFO] Finished at: 2016-02-19T12:09:19+08:00
[INFO] Final Memory: 26M/229M
[INFO] ------------------------------------------------------------------------

As you can see from the output and the pom.xml file, the s3-deploy profile executes three mojos in sequence: upload-source-bundle, create-application-version, and put-environment.

We welcome your feedback on the use of this Maven archetype. We want to add more features and continuously improve the user experience.

Event-driven architecture using Scala, Docker, Amazon Kinesis Firehose, and the AWS SDK for Java (Part 2)

by Sascha Moellering | in Java

In the first part of this blog post, we used the AWS SDK for Java to create a Scala application to write data in Amazon Kinesis Firehose, Dockerized the application, and then tested and verified the application is working. Now we will roll out our Scala application in Amazon EC2 Container Service (ECS) and use the Amazon EC2 Container Registry (Amazon ECR) as our private Docker registry.

To roll out our application on Amazon ECS, we have to set up a private Docker registry and an Amazon ECS cluster. First, we have to create IAM roles for Amazon ECS. Before we can launch container instances and register them into a cluster, we must create an IAM role for those container instances to use when they are launched. This requirement applies to container instances launched with the Amazon Machine Image (AMI) optimized for ECS, or any other instance on which you will run the Amazon ECS container agent.

aws iam create-role --role-name ecsInstanceRole --assume-role-policy-document file://<path_to_json_file>/ecsInstanceRole.json

ecsInstanceRole.json:

{
  "Version": "2008-10-17",
  "Statement": [
    {
      "Sid": "",
      "Effect": "Allow",
      "Principal": {
        "Service": "ec2.amazonaws.com"
      },
      "Action": "sts:AssumeRole"
    }
  ]
}


aws iam attach-role-policy --policy-arn arn:aws:iam::aws:policy/service-role/AmazonEC2ContainerServiceforEC2Role --role-name ecsInstanceRole

We have to attach an additional policy so the ecsInstanceRole can pull Docker images from Amazon ECR:

aws iam put-role-policy --role-name ecsInstanceRole --policy-name ecrPullPolicy --policy-document file://<path_to_json_file>/ecrPullPolicy.json

ecrPullPolicy.json:

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": [
                "ecr:BatchGetImage",
                "ecr:GetDownloadUrlForLayer",
                "ecr:GetAuthorizationToken"
            ],
            "Resource": "*"
        }
    ]
}

This role needs permission to write into our Amazon Kinesis Firehose stream, too:

aws iam put-role-policy --role-name ecsInstanceRole --policy-name firehosePolicy --policy-document file://<path_to_json_file>/firehosePolicy.json

firehosePolicy.json:

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "",
            "Effect": "Allow",
            "Action": [
                "firehose:DescribeDeliveryStream",
                "firehose:ListDeliveryStreams",
                "firehose:PutRecord",
                "firehose:PutRecordBatch"
            ],
            "Resource": [
                "arn:aws:firehose:aws-region:<account-ID>:deliverystream/<delivery-stream-name>"
            ]
        }
    ]
}

The Amazon ECS service scheduler makes calls on our behalf to the Amazon EC2 and Elastic Load Balancing APIs to register and deregister container instances with our load balancers. Before we can attach a load balancer to an Amazon ECS service, we must create an IAM role for our services to use. This requirement applies to any Amazon ECS service that we plan to use with a load balancer.

aws iam create-role --role-name ecsServiceRole --assume-role-policy-document file://<path_to_json_file>/ecsServiceRole.json

ecsServiceRole.json:

{
  "Version": "2008-10-17",
  "Statement": [
    {
      "Sid": "",
      "Effect": "Allow",
      "Principal": {
        "Service": "ecs.amazonaws.com"
      },
      "Action": "sts:AssumeRole"
    }
  ]
}


aws iam put-role-policy --role-name ecsServiceRole --policy-name ecsServicePolicy --policy-document file://<path_to_json_file>/ecsServicePolicy.json

ecsServicePolicy:

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "elasticloadbalancing:Describe*",
        "elasticloadbalancing:DeregisterInstancesFromLoadBalancer",
        "elasticloadbalancing:RegisterInstancesWithLoadBalancer",
        "ec2:Describe*",
        "ec2:AuthorizeSecurityGroupIngress"
      ],
      "Resource": [
        "*"
      ]
    }
  ]
}

Now we have set up the IAM roles and permissions required for a fully functional Amazon ECS cluster. Before setting up the cluster, we will create an ELB load balancer to be used for our akka-firehose service. The load balancer is called "akkaElb". It maps port 80 on the load balancer to port 80 on the instances, and uses the specified subnets and security groups.

aws elb create-load-balancer --load-balancer-name akkaElb --listeners "Protocol=HTTP,LoadBalancerPort=80,InstanceProtocol=HTTP,InstancePort=80" --subnets subnet-a,subnet-b --security-groups sg-a --region us-east-1

The health check configuration of the load balancer contains information such as the protocol, ping port, ping path, response timeout, and health check interval.

aws elb configure-health-check --load-balancer-name akkaElb --health-check Target=HTTP:80/api/healthcheck,Interval=30,UnhealthyThreshold=5,HealthyThreshold=2,Timeout=3 --region us-east-1

We should enable connection draining for our load balancer to ensure it will stop sending requests to deregistering or unhealthy instances, while keeping existing connections open.

aws elb modify-load-balancer-attributes --load-balancer-name akkaElb --load-balancer-attributes '{"ConnectionDraining":{"Enabled":true,"Timeout":300}}' --region us-east-1

After setting up the load balancer, we can now create the Amazon ECS cluster:

aws ecs create-cluster --cluster-name "AkkaCluster" --region us-east-1

To set up our Amazon ECR repository correctly, we will sign in to Amazon ECR and receive a token that will be stored in /home/ec2-user/.docker/config.json. The token is valid for 12 hours.

aws ecr get-login --region us-east-1

To store the Docker image, we will create a repository in Amazon ECR:

aws ecr create-repository --repository-name akka-firehose --region us-east-1

Under most circumstances, the Docker image will be created at the end of a build process triggered by a continuous integration server like Jenkins. Therefore, we have to create a repository policy so that the Jenkins IAM role can push and pull Docker images from our newly created repository:

aws ecr set-repository-policy --repository-name akka-firehose --region us-east-1 --policy-text '{
    "Version": "2008-10-17",
    "Statement": [
        {
            "Sid": "jenkins_push_pull",
            "Effect": "Allow",
            "Principal": {
                "AWS": "arn:aws:iam::<account-id>:role/<Jenkins-role>"
            },
            "Action": [
                "ecr:DescribeRepositories",
                "ecr:GetRepositoryPolicy",
                "ecr:BatchCheckLayerAvailability",
                "ecr:GetDownloadUrlForLayer",
                "ecr:ListImages",
                "ecr:BatchGetImage",
                "ecr:PutImage",
                "ecr:InitiateLayerUpload",
                "ecr:UploadLayerPart",
                "ecr:CompleteLayerUpload"
            ]
        }
    ]
}"

We have to add a similar repository policy for Amazon ECS because the Amazon EC2 instances in our Amazon ECS cluster have to be able to pull Docker images from our private Docker registry:

aws ecr set-repository-policy --repository-name akka-firehose --region us-east-1 --policy-text '{
    "Version": "2008-10-17",
    "Statement": [
        {
            "Sid": "ecs_instance_pull",
            "Effect": "Allow",
            "Principal": {
                "AWS": "arn:aws:iam::<account-id>:role/ecsInstanceRole"
            },
            "Action": [
                "ecr:DescribeRepositories",
                "ecr:GetRepositoryPolicy",
                "ecr:BatchCheckLayerAvailability",
                "ecr:GetDownloadUrlForLayer",
                "ecr:ListImages",
                "ecr:BatchGetImage"
            ]
        }
    ]
}"

Now we can tag and push the Docker container into our repository in Amazon ECR:

docker tag akka-firehose <account-id>.dkr.ecr.us-east-1.amazonaws.com/akka-firehose


docker push <account-id>.dkr.ecr.us-east-1.amazonaws.com/akka-firehose

To populate our Amazon ECS cluster, we have to launch a few Amazon EC2 instances and register them in our cluster. It is important to choose either the Amazon ECS-optimized AMI or an AMI for another operating system (such as CoreOS or Ubuntu) that has the Amazon ECS container agent installed. (In this example, we will use the ECS-optimized AMI based on Amazon Linux.) In the first step, we will create an instance profile and then add the ecsInstanceRole to this profile:

aws iam create-instance-profile --instance-profile-name ecsServer


aws iam add-role-to-instance-profile --role-name ecsInstanceRole --instance-profile-name ecsServer

Now we will use the following user data script to launch a few EC2 instances in different subnets:

ecs-userdata.txt:

#!/bin/bash
yum update -y
echo ECS_CLUSTER=AkkaCluster >> /etc/ecs/ecs.config

This user data script updates the Linux packages of the Amazon EC2 instance and registers it in the Amazon ECS cluster. By default, the container instance is launched into your default cluster if you don’t specify another one.

aws ec2 run-instances --image-id ami-840e42ee --count 1 --instance-type t2.medium --key-name <your_ssh_key> --security-group-ids sg-a --subnet-id subnet-a --iam-instance-profile Name=ecsServer --user-data file://<path_to_user_data_file>/ecs-userdata.txt --region us-east-1


aws ec2 run-instances --image-id ami-840e42ee --count 1 --instance-type t2.medium --key-name <your_ssh_key> --security-group-ids sg-a --subnet-id subnet-b --iam-instance-profile Name=ecsServer --user-data file://<path_to_user_data_file>/ecs-userdata.txt --region us-east-1

Now we will register our task definition and service:

aws ecs register-task-definition --cli-input-json file://<path_to_json_file>/akka-firehose.json --region us-east-1

akka-firehose.json:

{
  "containerDefinitions": [
    {
      "name": "akka-firehose",
      "image": "<your_account_id>.dkr.ecr.us-east-1.amazonaws.com/akka-firehose",
      "cpu": 1024,
      "memory": 1024,
      "portMappings": [{
                      "containerPort": 8080,
                      "hostPort": 80
              }],
      "essential": true
    }
  ],
  "family": "akka-demo"
}

The task definition specifies which image to use, how much CPU and memory the task requires, and the port mappings between the Docker container and the host.

aws ecs create-service --cluster AkkaCluster --service-name akka-firehose-service --cli-input-json file://<path_to_json_file>/akka-elb.json --region us-east-1

akka-elb.json:

{
    "serviceName": "akka-firehose-service",
    "taskDefinition": "akka-demo:1",
    "loadBalancers": [
        {
            "loadBalancerName": "akkaElb",
            "containerName": "akka-firehose",
            "containerPort": 8080
        }
    ],
    "desiredCount": 2,
    "role": "ecsServiceRole"
}

Our service uses version 1 of the task definition and connects the containers on port 8080 to our previously defined ELB load balancer. The configuration sets the desired count to two, so if we have registered two Amazon EC2 instances in our Amazon ECS cluster, each of them should run one task. After a short time, the service should be running on the cluster. We can test the current setup by sending a POST request to our ELB load balancer:

curl -v -H "Content-type: application/json" -X POST -d '{"userId":100, "userName": "This is user data"}' http://<address_of_elb>.us-east-1.elb.amazonaws.com/api/user

After sending data to our application, we can list the files in the S3 bucket we created as a target for Amazon Kinesis Firehose:

aws s3 ls s3://<your_name>-firehose-target --recursive

In this blog post, we created the infrastructure to roll out our Scala-based microservice on Amazon ECS, using Amazon ECR as our private Docker registry. We hope we’ve given you ideas for creating your own Dockerized Scala-based applications on AWS. Feel free to share your ideas in the comments below!

Event-driven architecture using Scala, Docker, Amazon Kinesis Firehose, and the AWS SDK for Java (Part 1)

by Sascha Moellering | on | in Java | Permalink | Comments |  Share

The key to developing a highly scalable architecture is to decouple functional parts of an application. In the context of an event-driven architecture, those functional parts are single-purpose event processing components (“microservices”). In this blog post, we will show you how to build a microservice using Scala, Akka, Scalatra, the AWS SDK for Java, and Docker. The application uses the AWS SDK for Java to write data into Amazon Kinesis Firehose, which can capture and automatically load streaming data into Amazon S3 and Amazon Redshift. Amazon S3 will be the target of the data we put into the Firehose delivery stream.

In a two-part series, this blog post will cover the following topics:

Part 1: How to use the AWS SDK for Java to get started with Scala development, how to set up Amazon Kinesis Firehose, and how to test your application locally.

Part 2: How to use Amazon EC2 Container Service (Amazon ECS) and Amazon EC2 Container Registry (Amazon ECR) to roll out your Dockerized Scala application.

After you have downloaded your IDE, set up your AWS account, created an IAM user, and installed the AWS CLI, you can check out the example application from https://github.com/awslabs/aws-akka-firehose.

Accessing Java classes from Scala is no problem, but Scala has language features (for example, function types and traits) that can’t be applied to Java directly. The core of this application is an Akka actor that writes JSON data into Amazon Kinesis Firehose. Akka implements the actor model, a model of concurrent computation: actors receive messages and take actions based on those messages. With Akka, it is easy to build a distributed system using remote actors. In this example, the FireHoseActor receives a message from a REST interface written with Scalatra, a small, efficient, Sinatra-like web framework for Scala. Scalatra implements the servlet specification, so Scalatra apps can be deployed in Tomcat, Jetty, or other servlet engines, or in Java EE application servers. To reduce dependencies and complexity, the application uses an embedded Jetty servlet engine that is bundled with the application. To bundle Jetty with our application, we have to add Jetty as a dependency in build.scala:

"org.eclipse.jetty" % "jetty-webapp" % "9.2.10.v20150310" % "container;compile",

In this example, we use sbt and the sbt-assembly plugin, which was inspired by Maven’s assembly plugin, to build a fat JAR containing all dependencies. We have to add sbt-assembly as a plugin in project/assembly.sbt and specify the main class in build.scala:

.settings(mainClass in assembly := Some("JettyLauncher"))
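
The plugin itself is declared in project/assembly.sbt. Here is a minimal sketch of that file, assuming a typical sbt 0.13 setup (the sbt-assembly version shown is an assumption; use the release that matches your sbt version):

// project/assembly.sbt: registers the sbt-assembly plugin (version shown is illustrative)
addSbtPlugin("com.eed3si9n" % "sbt-assembly" % "0.14.1")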

The main class specified in build.scala, JettyLauncher, is responsible for bootstrapping the embedded Jetty servlet engine.

def main(args: Array[String]) {
    val port = if (System.getenv("PORT") != null) System.getenv("PORT").toInt else 8080

    val server = new Server()
    val connector = new ServerConnector(server)
    connector.setHost("0.0.0.0");
    connector.setPort(port);
    server.addConnector(connector);

    val context = new WebAppContext()
    context setContextPath "/"
    context.setResourceBase("src/main/webapp")
    context.addEventListener(new ScalatraListener)
    context.addServlet(classOf[DefaultServlet], "/")
    server.setHandler(context)
    server.start
    server.join
  }

The ScalatraBootstrap file initializes the actor system and mounts the FirehoseActorApp servlet under the /api/ context:

class ScalatraBootstrap extends LifeCycle {
  val system = ActorSystem()
  val myActor = system.actorOf(Props[FireHoseActor])

  override def init(context: ServletContext) {
    context.mount(new FirehoseActorApp(system, myActor), "/api/*")
  }
}

The servlet exposes a REST API that accepts POST requests to /user with the parameters userId, userName, and timestamp. This API maps the passed values into a UserMessage object and sends this object as a message to the FireHoseActor.

class FirehoseActorApp(system: ActorSystem, firehoseActor: ActorRef) extends ScalatraServlet with JacksonJsonSupport {
  protected implicit lazy val jsonFormats: Formats = DefaultFormats
  implicit val timeout = new Timeout(2, TimeUnit.SECONDS)
  protected implicit def executor: ExecutionContext = system.dispatcher

  post("/user") {
    val userMessage = parsedBody.extract[UserMessage]
    firehoseActor ! userMessage
    Ok()
  }
}
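
The UserMessage type referenced here is defined in the example repository. The following is a minimal sketch of what such a case class could look like, with fields matching the REST parameters (illustrative only; the actual definition in the repository may differ):

// Illustrative sketch; the real UserMessage lives in the aws-akka-firehose repository
case class UserMessage(userId: Long, userName: String, timestamp: Long)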

The FireHoseActor uses the AWS SDK for Java to create an Amazon Kinesis Firehose client and sends received messages asynchronously to a Firehose stream:

def createFireHoseClient(): AmazonKinesisFirehoseAsyncClient = {
    log.debug("Connect to Firehose Stream: " + streamName)
    val client = new AmazonKinesisFirehoseAsyncClient
    val currentRegion = if (Regions.getCurrentRegion != null) Regions.getCurrentRegion else Region.getRegion(Regions.EU_WEST_1)
    client.withRegion(currentRegion)
    return client
  }


def sendMessageToFirehose(payload: ByteBuffer, partitionKey: String): Unit = {
   val putRecordRequest: PutRecordRequest = new PutRecordRequest
   putRecordRequest.setDeliveryStreamName(streamName)
   val record: Record = new Record
   record.setData(payload)
   putRecordRequest.setRecord(record)

   val futureResult: Future[PutRecordResult] = firehoseClient.putRecordAsync(putRecordRequest)

   try {
     val recordResult: PutRecordResult = futureResult.get
     log.debug("Sent message to Kinesis Firehose: " + recordResult.toString)
   }

   catch {
     case iexc: InterruptedException => {
       log.error(iexc.getMessage)
     }

     case eexc: ExecutionException => {
       log.error(eexc.getMessage)
     }
   }
 }
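
For completeness, here is a rough sketch of how the actor's receive method could tie these helpers together. This is an illustration only; the actual implementation in the repository may differ in detail, and the json4s serialization shown assumes the same json4s dependency already used by Scalatra's JSON support:

// Sketch only; see the aws-akka-firehose repository for the real FireHoseActor
override def receive: Receive = {
  case msg: UserMessage =>
    // Serialize the message to JSON and wrap it in a ByteBuffer for Firehose
    implicit val formats: org.json4s.Formats = org.json4s.DefaultFormats
    val json = org.json4s.jackson.Serialization.write(msg)
    sendMessageToFirehose(java.nio.ByteBuffer.wrap(json.getBytes("UTF-8")), msg.userId.toString)
}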

Using sbt to build the application is easy: the command sbt assembly compiles the code and builds a fat JAR containing all required libraries (in this example, target/scala-2.11/akka-firehose-assembly-0.1.0.jar).

Now we should focus on setting up the infrastructure used by the application. First, we create the S3 bucket:

aws s3 mb --region us-east-1 s3://<your_name>-firehose-target --output json

Second, we create an IAM role that Amazon Kinesis Firehose can assume to deliver data on our behalf:

aws iam create-role --query "Role.Arn" --output json \
    --role-name FirehoseDefaultDeliveryRole \
    --assume-role-policy-document '{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "PermitFirehoseAccess",
            "Effect": "Allow",
            "Principal": {
                "Service": "firehose.amazonaws.com"
            },
            "Action": "sts:AssumeRole"
        }
    ]
}"

Now, we need to add an IAM policy to this role so that Firehose can write to the S3 bucket:

aws iam put-role-policy \
    --role-name FirehoseDefaultDeliveryRole \
    --policy-name FirehoseDefaultDeliveryPolicy \
    --policy-document '{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "PermitFirehoseUsage",
            "Effect": "Allow",
            "Action": [
                "s3:AbortMultipartUpload",
                "s3:GetBucketLocation",
                "s3:GetObject",
                "s3:ListBucket",
                "s3:ListBucketMultipartUploads",
                "s3:PutObject"
            ],
            "Resource": [
                "arn:aws:s3:::<your_name>-firehose-target",
                "arn:aws:s3:::<your_name>-firehose-target/*"
            ]
        }
    ]
}"

The last step is to create the Firehose stream:

aws firehose create-delivery-stream --region eu-west-1 --query "DeliveryStreamARN" --output json \
    --delivery-stream-name firehose_stream \
    --s3-destination-configuration "RoleARN=<role_arn>,BucketARN=arn:aws:s3:::<your_name>-firehose-target"

To roll out the application in Amazon ECS, we need to build a Docker image containing the fat JAR and a JRE:

FROM phusion/baseimage

# Install Java.
RUN \
  echo oracle-java8-installer shared/accepted-oracle-license-v1-1 select true | debconf-set-selections && \
  add-apt-repository -y ppa:webupd8team/java && \
  apt-get update && \
  apt-get install -y oracle-java8-installer && \
  rm -rf /var/lib/apt/lists/* && \
  rm -rf /var/cache/oracle-jdk8-installer

WORKDIR /srv/jetty

# Define commonly used JAVA_HOME variable
ENV JAVA_HOME /usr/lib/jvm/java-8-oracle

ADD target/scala-2.11/akka-firehose-assembly-*.jar srv/jetty/
CMD java -server \
   -XX:+DoEscapeAnalysis \
   -XX:+UseStringDeduplication -XX:+UseCompressedOops \
   -XX:+UseG1GC -jar srv/jetty/akka-firehose-assembly-*.jar

This Dockerfile is based on phusion/baseimage and installs Oracle’s JDK 8. It also sets the JAVA_HOME variable, copies the fat JAR into the image, and starts the application with java -jar. Building the Docker image is pretty straightforward:

docker build -t smoell/akka-firehose .

There are two options for testing the application: by using the JAR file directly or by using the Docker container. To start the application by using the JAR file:

java -jar target/scala-2.11/akka-firehose-assembly-0.1.0.jar

With the following curl command, we can post data to our application to send data to our Firehose stream:

curl -v -H "Content-type: application/json" -X POST -d '{"userId":100, "userName": "This is user data"}' http://127.0.0.1:8080/api/user

The test looks a little bit different when we use the Docker container. First, we have to start the akka-firehose container and pass the AWS access key and secret access key as environment variables; this is not necessary when running on an EC2 instance, because the AWS SDK for Java can retrieve credentials from the instance metadata:

docker run --dns=8.8.8.8 --env AWS_ACCESS_KEY_ID="<your_access_key>" --env AWS_SECRET_ACCESS_KEY="<your_secret_access_key>" -p 8080:8080 smoell/akka-firehose

The curl command looks a little bit different this time, because we have to replace 127.0.0.1 with the IP address Docker is using on our local machine:

curl -v -H "Content-type: application/json" -X POST -d '{"userId":100, "userName": "This is user data"}' http://<your_docker_ip>:8080/api/user

After sending data to our application, we can list the files in the S3 bucket we’ve created as a target for Amazon Kinesis Firehose:

aws s3 ls s3://<your_name>-firehose-target --recursive

In this blog post, we used the AWS SDK for Java to create a Scala application that writes data into Amazon Kinesis Firehose, Dockerized the application, and then tested and verified that it works. In the second part of this blog post, we will roll out our application in Amazon ECS by using Amazon ECR as our private Docker registry.

Managing Dependencies in Gradle with AWS SDK for Java – Bill of Materials module (BOM)

by Manikandan Subramanian | on | in Java | Permalink | Comments |  Share

In an earlier blog post, I discussed how a Maven bill of materials (BOM) module can be used to manage your Maven dependencies on the AWS SDK for Java.

In this blog post, I will provide an example of how you can use the Maven BOM in your Gradle projects to manage the dependencies on the SDK. I will use an open source Gradle dependency management plugin from Spring to import a BOM and then use its dependency management.

Here is the build.gradle snippet to apply the dependency management plugin to the project:

buildscript {
    repositories {
        mavenCentral()
    }
    dependencies {
        classpath "io.spring.gradle:dependency-management-plugin:0.5.4.RELEASE"
    }
}

apply plugin: "io.spring.dependency-management"

Now, import the Maven BOM into the dependencyManagement section and specify the SDK modules in the dependencies section, as shown here:

dependencyManagement {
    imports {
        mavenBom 'com.amazonaws:aws-java-sdk-bom:1.10.47'
    }
}

dependencies {
    compile 'com.amazonaws:aws-java-sdk-s3'
    testCompile group: 'junit', name: 'junit', version: '4.11'
}

Gradle resolves the aws-java-sdk-s3 module to the version specified in the BOM, as shown in the following dependency resolution diagram.

Diagram of the Gradle dependency resolution

Have you been using the AWS SDK for Java in Gradle? If so, please leave us your feedback in the comments.