AWS Developer Blog

Release: AWS SDK for PHP 2.4.2

by Jeremy Lindblom | in PHP

We would like to announce the release of version 2.4.2 of the AWS SDK for PHP. This release adds support for custom Amazon Machine Images (AMIs) and Chef 11 to the AWS OpsWorks client, adds the latest snapshot permission features to the Amazon Redshift client, and updates the Amazon EC2 and AWS Security Token Service clients.

Changelog

  • Added support for cross-account snapshot access control to the Amazon Redshift client
  • Added support for decoding authorization messages to the AWS STS client
  • Added support for checking for required permissions via the DryRun parameter to the Amazon EC2 client (see the sketch after this list)
  • Added support for custom Amazon Machine Images (AMIs) and Chef 11 to the AWS OpsWorks client
  • Added an SDK compatibility test to allow users to quickly determine if their system meets the requirements of the SDK
  • Updated the Amazon EC2 client to use the 2013-06-15 API version
  • Fixed an unmarshalling error with the Amazon EC2 CreateKeyPair operation
  • Fixed an unmarshalling error with the Amazon S3 ListMultipartUploads operation
  • Fixed an issue with the Amazon S3 stream wrapper "x" fopen mode
  • Fixed an issue with Aws\S3\S3Client::downloadBucket by removing leading slashes from the passed $keyPrefix argument
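
To illustrate the new Amazon EC2 DryRun support, here is a minimal sketch using the SDK; the client settings and instance ID are placeholders, and the error handling is only illustrative.

use Aws\Ec2\Ec2Client;
use Aws\Ec2\Exception\Ec2Exception;

// Create an Amazon EC2 client object
$ec2 = Ec2Client::factory(array(
    'key'    => '[aws access key]',
    'secret' => '[aws secret key]',
    'region' => 'us-east-1'
));

try {
    // DryRun checks whether you have the required permissions
    // without actually starting the instance
    $ec2->startInstances(array(
        'InstanceIds' => array('i-12345678'),
        'DryRun'      => true
    ));
} catch (Ec2Exception $e) {
    // A "DryRunOperation" error indicates the request would have succeeded
    echo $e->getMessage() . "\n";
}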

Install/Download the Latest SDK

Closeable S3Objects

by Jason Fulghum | in Java

The com.amazonaws.services.s3.model.S3Object class now implements the Closeable interface (AWS SDK for Java 1.4.8 onwards). This allows you to use it as a resource in a try-with-resources statement. S3Object contains an S3ObjectInputStream that lets you stream down your data over the HTTP connection from Amazon S3. Since the HTTP connection is open and waiting, it’s important to read the stream quickly after calling getObject and to remember to close the stream so that the HTTP connection can be released properly. With the new Closeable interface, it’s even easier to ensure that you’re properly handling those HTTP connection resources.

The following snippet demonstrates how simple it is to use S3Object with a try-with-resources statement.

try (S3Object object = s3.getObject(bucket, key)) {
    System.out.println("key: " + object.getKey());
    System.out.println("data: " + dumpStream(object.getObjectContent());
} catch (Exception e) {
    System.out.println("Unable to download object from Amazon S3: " + e);
}

AWS SDK for Ruby v1.14.0

by Trevor Rowe | in Ruby

We just published v1.14.0 of the AWS SDK for Ruby (aws-sdk gem). This release updates the SDK to support custom Amazon Machine Images (AMIs) and Chef 11 for AWS OpsWorks. It also updates the Amazon Simple Workflow Service and Amazon Simple Notification Service clients to their latest API versions.

You can view the release notes here.

AWS.Extensions renaming

by Pavel Safronov | in .NET

Earlier this week, you may have noticed that the assembly AWS.Extensions—which contained DynamoDBSessionStateStore—has been renamed to AWS.SessionProvider. Our original intent with AWS.Extensions was to create a place for SDK extensions, which aren’t strictly part of the AWS SDK for .NET. We have since developed another extension, DynamoDBTraceListener, a TraceListener that allows the logging of trace and debug output to Amazon DynamoDB.

Unfortunately, the two extensions have distinct requirements: DynamoDBSessionStateStore references System.Web, and thus cannot be used in a Client Profile, while DynamoDBTraceListener does not. So to avoid requiring customers to reference unnecessary assemblies, we’ve decided to separate the AWS.Extensions project into multiple task-oriented solutions. Thus, customers who only require DynamoDB logging will not have to import the server assemblies required by the session provider.

Migration

If you are referencing AWS.Extensions.dll in your project, simply change the reference to AWS.SessionProvider.dll. There are no code changes to be made for this.

NuGet users should remove the reference to AWS.Extensions and instead use the new AWS.SessionProvider package. The existing AWS.Extensions package is now marked OBSOLETE.
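
Using the NuGet Package Manager Console, that migration might look like the following (package names as above):

Uninstall-Package AWS.Extensions
Install-Package AWS.SessionProvider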

DynamoDBTraceListener

by Pavel Safronov | in .NET

We recently introduced the DynamoDBTraceListener, a System.Diagnostics TraceListener that can be used to log events straight to Amazon DynamoDB. In this post, we show how simple it is to configure the listener and how to customize the data that is being logged.

Configuration

You can configure the listener either through code or by using a config file. (For console applications, this will be app.config, while IIS projects will use web.config.) Here is a sample configuration that lists a few of the possible configuration parameters:

<system.diagnostics>
  <trace autoflush="true">
    <listeners>
      <add name="dynamo" type="Amazon.TraceListener.DynamoDBTraceListener, AWS.TraceListener"
                      Region="us-west-2"
                      ExcludeAttributes="Callstack"
                      HashKeyFormat="%ComputerName%-{EventType}-{ProcessId}"
                      RangeKeyFormat="{Time}"
        />
    </listeners>
  </trace>    
</system.diagnostics>  

Web.config parameters

Here are all the possible parameters you can define in the config file, along with their meanings and defaults (a fuller example follows the list):

  • AWSAccessKey : Access key to use.
  • AWSSecretKey : Secret key to use. The access and secret keys can be set either in the listener definition or in the appSettings section. If running on an EC2 instance with a role, the listener can use the instance credentials. When specifying these, consider using an IAM user with a restricted policy like the example at the bottom of this post.
  • Region : Region to use DynamoDB in. The default is "us-west-2".
  • Table : Table to log to. The default is "Logs".
  • CreateIfNotExist : Controls whether the table will be auto created if it doesn’t exist. The default is true. If this flag is set to false and the table doesn’t exist, an exception is thrown.
  • ReadCapacityUnits : Read capacity units if the table is not yet created. The default is 1.
  • WriteCapacityUnits : Write capacity units if the table is not yet created. The default is 10.
  • HashKey : Name of the hash key if the table is not yet created. The default is "Origin".
  • RangeKey : Name of the range key if the table is not yet created. The default is "Timestamp".
  • MaxLength : Maximum length of any single attribute. The default is 10,000 characters ("10000").
  • ExcludeAttributes : Comma-separated list of attributes that should not be logged. The default is null – all possible attributes are logged.
  • HashKeyFormat : Format of the hash-key for each logged item. Default format is "{Host}". See format description below.
  • RangeKeyFormat : Format of the range-key for each logged item. Default format is "{Time}". See format description below.
  • WritePeriodMs : Frequency of writes to DynamoDB, in milliseconds. The listener will accumulate logs in a local file until this time has elapsed. The default is one minute ("60000").
  • LogFilesDir : Directory to write temporary logs to. If you don’t specify a directory, the listener attempts to use the current directory, then the temporary directory. If neither is available for writing, the listener will be disabled.
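
Putting several of these parameters together, a fuller listener definition might look like the following sketch. The values shown are the documented defaults, and the access and secret keys are placeholders.

<add name="dynamo" type="Amazon.TraceListener.DynamoDBTraceListener, AWS.TraceListener"
     AWSAccessKey="YOUR-ACCESS-KEY"
     AWSSecretKey="YOUR-SECRET-KEY"
     Region="us-west-2"
     Table="Logs"
     CreateIfNotExist="true"
     ReadCapacityUnits="1"
     WriteCapacityUnits="10"
     WritePeriodMs="60000"
     HashKeyFormat="{Host}"
     RangeKeyFormat="{Time}"
  />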

Hash/range key formats

As you’ve noticed from our example, the hash and range keys can be compounded. The format can consist of strings, existing attribute names (e.g., {Host}), environment variables (e.g., %ComputerName%), or any combination of these. Here is an example that combines all possible approaches:

Prod-%ComputerName%-{EventType}

When constructing the format, you can use the following attributes: Callstack, EventId, EventType, Host, Message, ProcessId, Source, ThreadId, Time. These are also the attributes that can be excluded from being logged with the ExcludeAttributes configuration.

Using DynamoDBTraceListener programmatically

Should you need to create and use the listener in your code, this is a simple and straightforward operation. The next sample shows how to create and invoke a listener.

DynamoDBTraceListener listener = new DynamoDBTraceListener
{
    Configuration = new DynamoDBTraceListener.Configs
    {
        AWSCredentials = new BasicAWSCredentials(accessKey, secretKey),
        Region = RegionEndpoint.USEast1,
        HashKeyFormat = "%ComputerName%-{EventType}"
    }
};
listener.WriteLine("This is a test", "Test Category");
listener.Flush();

Background logging

DynamoDBTraceListener logs events in two separate stages. First, we write the event data to a file on disk. Then, at periodic intervals, these files are pushed to DynamoDB. We use this approach for a number of reasons, including asynchronous logging and the batching of writes, but most importantly to prevent loss of data if the hosting application terminates unexpectedly. If that happens, any existing log files are pushed to DynamoDB the next time the application runs and the listener flushes its logs.

Even though the listener writes data to DynamoDB on a periodic basis, it is important to remember to flush the listener or to properly dispose of whatever resources you have that log, such as the client objects in the AWS SDK for .NET. Otherwise, you may find some of your logs are not being uploaded to DynamoDB.
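
As a minimal sketch, assuming the listener is registered through the config file shown earlier, you can flush trace output explicitly before your application exits so that any locally buffered log files are pushed to DynamoDB:

using System.Diagnostics;

class Program
{
    static void Main()
    {
        Trace.WriteLine("Application shutting down", "Lifecycle");
        // Flushes all registered listeners, including DynamoDBTraceListener
        Trace.Flush();
    }
}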

When the listener first starts, we attempt to find a directory for the log files. Three different locations are considered: LogFilesDir, if one is configured by the user; the directory containing the current assembly; the current user’s temporary folder (as resolved by the Path.GetTempPath method). Once a location is determined, an information event is written to the Event Log specifying the current logging location. If none of these locations are available, however, an error event is written to the Event Log and the listener is disabled.

IAM user

For security, you should avoid putting your root account credentials in the application config. A much better approach is to create an IAM user with specific permissions. Below is an example of a policy that limits the user’s permissions to DynamoDB only, and only to those operations that the listener actually uses. Furthermore, we’re limiting access to just the log table.

{
  "Statement" : [
    {
      "Effect" : "Allow",
      "Action" : [
        "dynamodb:DescribeTable",
        "dynamodb:CreateTable",
        "dynamodb:BatchWriteItem"
      ],
      "Resource" : "arn:aws:dynamodb:us-west-2:YOUR-ACCOUNT-ID:table/Logs"
    }
  ]
}

IAM Roles

If you are using DynamoDBTraceListener in an environment that is configured with an IAM Role, you can omit the AWSAccessKey and AWSSecretKey parameters from the config file. In this case, DynamoDBTraceListener will access DynamoDB with permissions configured for the IAM Role.

Injecting Failures and Latency using the AWS SDK for Java

by Wade Matveyenko | in Java

Today we have another guest post from a member of the Amazon DynamoDB team, Pejus Das.


The Amazon DynamoDB service provides fast and predictable performance with seamless scalability. It also has a list of common errors that can occur during request processing. You probably have a set of test suites that you run before you release changes to your application. If you’re using DynamoDB in your application, your tests probably call it using an isolated test account or one of the mock DynamoDB implementations available (the first reference link below lists some sample open source libraries, object-relational mappers, and mock implementations). Or maybe you use a combination of both, with mocking for unit tests and a test account for integration tests. Either way, your test suite likely covers the expected successful scenarios and the expected failure scenarios.

But then there are the other classes of failures that are harder to test for. Amazon DynamoDB is a remote dependency that you call across the network (or possibly even over the internet). A whole class of things can go wrong with this kind of an interaction, and when things do go wrong, your application will behave a lot better if you’ve tested those failure scenarios in advance.

There are many approaches to injecting unexpected failures in your application. For example, you can simulate what happens to your application when DynamoDB returns one of the documented errors returned by DynamoDB. You can also test the impact of high request latencies on your application. Such testing helps to build reliable and robust client applications that gracefully handle service errors and request delays.  In this blog post, we describe another approach: how you can easily inject these kinds of failures into the client application using the AWS SDK for Java.

Request Handlers

The AWS SDK for Java allows you to register request handlers with the DynamoDB Java client. You can attach multiple handlers, and they are executed in the order you added them to the client. The RequestHandler interface gives you three hooks into the request execution cycle: beforeRequest, afterResponse, and afterError.

  • beforeRequest – Called just before the HTTP request is executed against an AWS service like DynamoDB
  • afterResponse – Called just after the response is received and processed by the client
  • afterError – Called if an AmazonClientException occurs while executing the HTTP request

The RequestHandler hooks give an easy way to inject failures and latencies in the client for testing.
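
For context, here is a minimal sketch of attaching a handler that implements these hooks to the DynamoDB client. The class name, placeholder credentials, and empty hook bodies are ours; the hook signatures follow the RequestHandler interface described above.

import com.amazonaws.Request;
import com.amazonaws.auth.BasicAWSCredentials;
import com.amazonaws.handlers.RequestHandler;
import com.amazonaws.services.dynamodbv2.AmazonDynamoDBClient;
import com.amazonaws.util.TimingInfo;

public class FaultInjectionSetup {
    public static void main(String[] args) {
        // Placeholder credentials for illustration only
        AmazonDynamoDBClient dynamoDBClient = new AmazonDynamoDBClient(
                new BasicAWSCredentials("access-key", "secret-key"));

        // Handlers are executed in the order they are added to the client
        dynamoDBClient.addRequestHandler(new RequestHandler() {
            @Override
            public void beforeRequest(Request<?> request) {
                // Inspect the request, inject failures, or add latency here
            }

            @Override
            public void afterResponse(Request<?> request, Object response, TimingInfo timingInfo) {
                // Inspect or alter the response here
            }

            @Override
            public void afterError(Request<?> request, Exception e) {
                // React to client-side errors here
            }
        });
    }
}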

Injecting Failures

The beforeRequest hook provides access to the Request object. You can inspect the Request and take some action based either on the Request or on some other condition. In the following example, we inspect a PutItemRequest and inject a ProvisionedThroughputExceededException for, on average, 50 percent of put requests.

@Override
public void beforeRequest(Request<?> request) {
    // Things to do just before a request is executed 
    if (request.getOriginalRequest() instanceof PutItemRequest) {
        // Throw throughput exceeded exception for 50% of put requests
        if (rnd.nextInt(2) == 0) {
           logger.info("Injecting ProvisionedThroughputExceededException");
           throw new ProvisionedThroughputExceededException("Injected Error");
        }
    }
    // Add latency to some Get requests 
    if (request.getOriginalRequest() instanceof GetItemRequest) {
        // Delay 50% of GetItem requests by 500 ms 
        if (rnd.nextInt(2) == 0) {
            // Delay on average 50% of the requests from client perspective 
            try {
                logger.info("Injecting 500 ms delay");
                Thread.sleep(500);
            } catch (InterruptedException ie) {
                logger.info(ie);
                throw new RuntimeException(ie);
            }
        }
    }
}

Injecting Latency

You could simply put a sleep in the beforeRequest hook to simulate latency. If you want to inspect the Response object and inject latency for specific traffic, use the afterResponse hook instead. You can analyze the response data from DynamoDB and act accordingly. In the following example, we check for a GetItemRequest, and when the item is an Airplane, we modify the item and additionally add a 500 ms delay.

@Override
public void afterResponse(Request<?> request, Object resultObject, TimingInfo timingInfo) {
    // The following is a hit and miss for multi-threaded
    // clients as the cache size is only 50 entries
    String awsRequestId = dynamoDBClient.getCachedResponseMetadata(
                          request.getOriginalRequest()).getRequestId();
    logger.info("AWS RequestID: " + awsRequestId);
    // Here you could inspect and alter the response object to
    // see how your application behaves for specific data
    if (request.getOriginalRequest() instanceof GetItemRequest) {
        GetItemResult result = (GetItemResult) resultObject;
        Map<String, AttributeValue> item = result.getItem();
        if (item.get("name").getS().equals("Airplane")) {
            // Alter the item
            item.put("name", new AttributeValue("newAirplane"));
            item.put("new attr", new AttributeValue("new attr"));
            // Add some delay
            try {
                Thread.sleep(500);
            } catch (InterruptedException ie) { 
                logger.info(ie);
                throw new RuntimeException(ie);
            }
        }
    }
}

The preceding code examples are available on GitHub in the awslabs aws-dynamodb-examples repository (see the reference links below).

While this approach simulates increased latency and failures, it is only a simulation based on what you think will be happening during a failure. If you want to test your application for failures even more thoroughly, take a look at the Chaos Monkey and Simian Army applications written by Netflix. These inject actual failures into your system, revealing the interactions between more components in your system than just your application logic. We hope that adding fault injection testing to your application helps you be prepared for failure. Let us know in the comments!

Reference Links

  1. http://aws.typepad.com/aws/2012/04/amazon-dynamodb-libraries-mappers-and-mock-implementations-galore.html
  2. http://docs.aws.amazon.com/amazondynamodb/latest/developerguide/ErrorHandling.html
  3. https://github.com/awslabs
  4. https://github.com/awslabs/aws-dynamodb-examples/tree/master/inject-errors-latencies
  5. http://techblog.netflix.com/2012/07/chaos-monkey-released-into-wild.html
  6. http://techblog.netflix.com/2011/07/netflix-simian-army.html

Using Client-Side Encryption for S3 in the AWS SDK for Ruby

by Alex Wood | in Ruby

What is client-side encryption, and why might I want to use it?

If you wish to store sensitive data in Amazon S3 with the AWS SDK for Ruby, you have several ways of managing the safety and security of the data. One good practice is to use HTTPS whenever possible to protect your data in transit. Another is to use S3’s built in server-side encryption to protect your data at rest. In this post, we highlight yet another option, client-side encryption.

Client-side encryption is a little more involved than server-side encryption, since you manage your own encryption keys. But it has the added benefit that your data never exists in an unencrypted state outside of your execution environment.

How do I use client-side encryption with the AWS SDK for Ruby?

The SDK for Ruby does most of the heavy lifting for you when using client-side encryption for your S3 objects. When performing read and write operations on S3, you can specify various options in an option hash passed in to the S3Object#write and S3Object#read methods.

One of these options is :encryption_key, which accepts either an RSA key (for asymmetric encryption), or a string (for symmetric encryption). The SDK for Ruby then uses your key to encrypt an auto-generated AES key, which is used to encrypt and decrypt the payload of your message. The encrypted form of your auto-generated key is stored with the headers of your object in S3.

Here is a short example you can try to experiment with client-side encryption yourself:

require 'aws-sdk'
require 'openssl'

# Set your own bucket/key/data values
bucket = 's3-bucket'
key = 's3-object-key'
data = 'secret message'

# Creates a string key - store this!
symmetric_key = OpenSSL::Cipher::AES256.new(:CBC).random_key

options = { :encryption_key => symmetric_key }
s3_object = AWS.s3.buckets[bucket].objects[key]

# Writing an encrypted object to S3
s3_object.write(data, options)

# Reading the object from S3 and decrypting
puts s3_object.read(options)

There are a couple of practical matters you should consider. One is that if you lose the key used to encrypt an object, you will be unable to decrypt its contents. You should store your key securely (e.g., as a file or using a separate key management system) and load it when needed for writing or reading objects. Additionally, client-side encryption and decryption add some performance overhead, so you should use them only when needed (the overhead varies depending on the size and type of key used).
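
For the asymmetric case, here is a minimal sketch of storing the key as a PEM file and loading it again for reads. The bucket, object key, and file name are placeholders.

require 'aws-sdk'
require 'openssl'

# Generate an RSA key pair and persist it so the object can be decrypted later
rsa_key = OpenSSL::PKey::RSA.new(2048)
File.write('s3-encryption-key.pem', rsa_key.to_pem)

s3_object = AWS.s3.buckets['s3-bucket'].objects['s3-object-key']

# The SDK encrypts the auto-generated AES envelope key with the RSA key
s3_object.write('secret message', :encryption_key => rsa_key)

# Later: load the same key pair to read and decrypt the object
loaded_key = OpenSSL::PKey::RSA.new(File.read('s3-encryption-key.pem'))
puts s3_object.read(:encryption_key => loaded_key)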

You can read more about the encryption choices available to you with the AWS SDK for Ruby in our API documentation. You can also read more about general best practices for security in AWS by following the AWS Security Blog. As you consider the choices available for securing your data, we hope you find them effective and simple to use.

Amazon S3 PHP Stream Wrapper

by Michael Dowling | in PHP

As of the 2.3.0 release, the AWS SDK for PHP now provides an official Amazon S3 PHP stream wrapper. The stream wrapper allows you to treat Amazon S3 like a filesystem using functions like fopen(), file_get_contents(), and filesize() through a custom stream wrapper protocol. The Amazon S3 stream wrapper opens up some interesting possibilities that were either previously impossible or difficult to implement.

Registering the stream wrapper

Before you can use the Amazon S3 stream wrapper, you must register it with PHP:

use Aws\S3\S3Client;

// Create an Amazon S3 client object
$client = S3Client::factory(array(
    'key'    => '[aws access key]',
    'secret' => '[aws secret key]'
));

// Register the stream wrapper from a client object
$client->registerStreamWrapper();

After registering the stream wrapper, you can use various PHP filesystem functions that support custom stream wrapper protocols.

$bucket = 'my_bucket';
$key = 'object_key';

// Get the contents of an object as a string
$contents = file_get_contents("s3://{$bucket}/{$key}");

// Get the size of an object
$size = filesize("s3://{$bucket}/{$key}");

Stream wrappers in PHP are identified by a unique protocol; the Amazon S3 stream wrapper uses the "s3://" protocol. Amazon S3 stream wrapper URIs always start with the "s3://" protocol followed by an optional bucket name, forward slash, and optional object key: s3://bucket/key.

Streaming downloads

The Amazon S3 stream wrapper allows you to truly stream downloads from Amazon S3 using functions like fopen(), fread(), and fclose(). This allows you to read bytes off of a stream as needed rather than downloading an entire stream upfront and then working with the data.

The following example opens a read-only stream, reads up to 1024 bytes from the stream at a time, and closes the stream when no more data can be read from it.

// Open a stream in read-only mode
if (!($stream = fopen("s3://{$bucket}/{$key}", 'r'))) {
    die('Could not open stream for reading');
}

// Check if the stream has more data to read
while (!feof($stream)) {
    // Read 1024 bytes from the stream
    echo fread($stream, 1024);
}
// Be sure to close the stream resource when you're done with it
fclose($stream);

Seekable streams

Because no data is buffered in memory, read-only streams with the Amazon S3 stream wrapper are by default not seekable. You can force the stream to allow seeking using the seekable stream context option.

// Create a stream context to allow seeking
$context = stream_context_create(array(
    's3' => array(
        'seekable' => true
    )
));

if ($stream = fopen('s3://bucket/key', 'r', false, $context)) {
    // Read bytes from the stream
    fread($stream, 1024);
    // Seek back to the beginning of the stream
    fseek($stream, 0);
    // Read the same bytes that were previously read
    fread($stream, 1024);
    fclose($stream);
}

Opening seekable streams allows you to seek only to bytes that have been previously read. You cannot skip ahead to bytes that have not yet been read from the remote server. In order to allow previously read data to be recalled, data is buffered in a PHP temp stream using Guzzle’s CachingEntityBody decorator.

Streaming uploads from downloads

You can use an Amazon S3 stream resource with other AWS SDK for PHP operations. For example, you could stream the contents of one Amazon S3 object to a new Amazon S3 object.

$stream = fopen("s3://{$bucket}/{$key}", 'r');

if (!$stream) {
    die('Unable to open stream for reading');
}

$client->putObject(array(
    'Bucket' => 'other_bucket',
    'Key'    => $key,
    'Body'   => $stream
));

fclose($stream);

Uploading data

In addition to downloading data with the stream wrapper, you can use the stream wrapper to upload data as well.

$stream = fopen("s3://{$bucket}/{$key}", 'w');
fwrite($stream, 'Hello!');
fclose($stream);

Note: Because Amazon S3 requires a Content-Length for all entity-enclosing HTTP requests, the contents of an upload must be buffered using a PHP temp stream before it is sent over the wire.

Traversing buckets

You can modify and browse Amazon S3 buckets similar to how PHP allows the modification and traversal of directories on your filesystem.

Here’s an example of creating a bucket:

mkdir('s3://bucket');

You can delete empty buckets using the rmdir() function.

rmdir('s3://bucket');

The opendir(), readdir(), rewinddir(), and closedir() PHP functions can be used with the Amazon S3 stream wrapper to traverse the contents of a bucket.

$dir = "s3://bucket/";

if (is_dir($dir) && ($dh = opendir($dir))) {
    while (($file = readdir($dh)) !== false) {
        echo "filename: {$file} : filetype: " . filetype($dir . $file) . "n";
    }
    closedir($dh);
}

You can recursively list each object and prefix in a bucket using PHP’s RecursiveDirectoryIterator.

$dir = 's3://bucket';
$iterator = new RecursiveIteratorIterator(new RecursiveDirectoryIterator($dir));

foreach ($iterator as $file) {
    echo $file->getType() . ': ' . $file . "\n";
}

Using the Symfony2 Finder component

The easiest way to traverse an Amazon S3 bucket using the Amazon S3 stream wrapper is through the Symfony2 Finder component. The Finder component allows you to more easily filter the files that the stream wrapper returns.

require 'vendor/autoload.php';

use Symfony\Component\Finder\Finder;

$finder = new Finder();

// Get all files and folders (key prefixes) from "bucket" that are less than
// 100K and have been updated in the last year
$finder->in('s3://bucket')
    ->size('< 100K')
    ->date('since 1 year ago');

foreach ($finder as $file) {
    echo $file->getType() . ": {$file}n";
}

You will need to install the Symfony2 Finder component and add it to your project’s autoloader in order to use it with the AWS SDK for PHP. The most common way to do this is to add the Finder component to your project’s composer.json file. You can find out more about this process in the Composer documentation.
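
For example, a minimal composer.json for this setup might look like the following; the version constraints are illustrative, not prescriptive.

{
    "require": {
        "aws/aws-sdk-php": "~2.4",
        "symfony/finder": "~2.3"
    }
}

After updating composer.json, run composer install (or composer update) to pull the packages into your project's vendor/ directory and regenerate the autoloader.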

More information

We hope you find the new Amazon S3 stream wrapper useful. You can find more information and documentation on the Amazon S3 stream wrapper in the AWS SDK for PHP User Guide.

AWS SDKs and Tools @ OSCON

by Jason Fulghum | in Java

A few of us from the SDKs and Tools teams will be down in Portland for OSCON next week.

We’ll be at the AWS booth talking to customers, answering questions, and as always, looking for talented engineers, managers, and designers interested in building the future of the AWS platform. If you aren’t able to drop by at OSCON, you can always browse our open positions and apply online.

If you’ll be at the conference, please come by and say hello! We love hearing from our customers, whether it’s feature requests for our tools and services, or if you just want to talk about ideas for new ways to use AWS.

We hope we’ll see some of you in Portland!



Happy Birthday, SDK! Now Let’s Celebrate the Future

by Loren Segal | in Ruby

Today marks the second anniversary of the AWS SDK for Ruby. Over the last two years, the SDK has grown and developed to support the full array of available AWS services and high-level features like resource abstractions and enumeration, as well as Rails email and model layer integration. We are honored by the positive customer feedback we’ve received so far. We hope to continue earning your support as we move forward with the SDK.

One of the things I am personally proud of is the increase in community involvement we’ve received on GitHub within the last year since I joined Amazon. As someone who comes from the open source world, it’s great to see that process working so well in the SDK. The bug reports and pull requests that we’ve gotten from users have been top-notch quality and extremely helpful to everyone else using the gem, and we only want to see that level of engagement get better as time goes on. We want to thank all of our users who have been involved in the process and have helped to improve the SDK.

So here’s to the AWS SDK for Ruby turning 2 years old!

On to version 2.0

Of course, having a great SDK with a great community does not mean we should stop innovating, and so today, on its 2nd anniversary, we are also marking the start of development on version 2.0 of the AWS SDK for Ruby. We’re excited to share some of the great ideas we’ve been kicking around that will modernize the SDK and make it even easier to use. More importantly though, we are opening up this dialog because we also want your feedback about the features you believe belong in the next version of the Ruby SDK. There is still time to get your ideas in.

Over the coming weeks, we plan on sharing more information, and code, about version 2.0 of the SDK. If you want a front seat in the development, or even want to help out, watch this space. Until then, here are some of the things that will be coming to the new version of the SDK:

Memoization by default

Currently, operations called from the high-level abstractions of various services (like Amazon S3’s "bucket collection" resource) are set up not to memoize return values from requests by default. This means that in many cases your code can end up sending more requests than necessary to get at data you have already loaded in a previous request. Furthermore, memoization currently works somewhat inconsistently across services. A great example of this can be illustrated by grabbing user data from an Amazon EC2 instance:

instance = AWS.ec2.instances.first
puts instance.user_data # sends one request
puts instance.user_data # sends ANOTHER request

In version 2.0 of the SDK, we plan on making resources memoize data by default. In other words, when hydrating a resource from a request, that resource will maintain all of the data from the original request. Any further calls on that resource will use only the data from that original request. This will ensure a more consistent experience when dealing with services like EC2, and will improve performance of the SDK in many cases. If you want to explicitly reload fresh data from the service, you will still be able to do so by hydrating a new resource object.

High-level abstractions moved into separate gems

The AWS SDK for Ruby is very extensive, but that also means it is very large. With 30+ supported services, the core SDK gem contains almost 400 classes in over 500 files with more than 26,000 lines of code. That’s a lot of code to manage in just one package. This one package may also contain features, like Rails integration and XML libraries, that your application does not need and that, in some cases, might have to be disabled altogether to avoid conflicts.

Splitting the SDK into multiple packages will help keep the codebase small and focused while avoiding these integration conflicts. A small core codebase with small extra component libraries also means that contributors to the SDK will have an easier time navigating the libraries, running tests, and submitting patches. We believe that making it easier for our users to contribute code to the SDK is a feature, not a side effect, of development. Anything we can do to improve the lives of those submitting pull requests is effort well spent.

Built to be extensible: a strong plugin API

In order to support a highly modular SDK with a healthy third-party ecosystem, we need a strong plugin API. Version 2.0 of the AWS SDK for Ruby will be developed with extensibility as a primary concern. More importantly, to help ensure the quality of this API, we plan on eating our own dog food. The plugin architecture, built on top of Seahorse, will be a first class citizen in the new SDK, and will be how we implement all of the core functionality of the library. This means that if you are a third-party developer writing an abstraction for any given service, you should feel comfortable knowing that the APIs you are using will be well supported because they will be the same ones we used to implement the very service you are wrapping.

Dropping support for Ruby 1.8.x

Finally, with the recently announced end of life for Ruby 1.8.7, we believe that it is time to start moving forward, and we will no longer be supporting Ruby 1.8.7 in version 2.0 of the aws-sdk gem. We heard you loud and clear when discussing this issue on GitHub, and we are aware of the maintenance burden that supporting 1.8.7 will bring. We believe that it is in the best interest of our customers to support the latest and greatest versions of Ruby, 1.9.x and 2.0.x.

Note that users on Ruby 1.8.x will still be able to use version 1.0 of the SDK, and we will have more information on how we will continue to support legacy users in upcoming posts.

More to come; help us define the future

Of course there is much more to talk about in this new version of the AWS SDK for Ruby, and we plan on covering many of these topics in future posts over the coming weeks. If the features we did manage to talk about sound interesting to you, don’t be shy about getting involved. If you think we missed any important details, make your voice heard too. This is your opportunity to help us define what the future of the SDK will look like, and we will be listening for your comments. You can get in touch with us here, on the forums, or on GitHub.

Now, let’s take one last moment to say happy birthday to the AWS SDK for Ruby, and another moment to get excited about what’s to come!