AWS Developer Blog

Release: AWS SDK for PHP – Version 2.4.9

by Michael Dowling | in PHP

We would like to announce the release of version 2.4.9 of the AWS SDK for PHP. This release adds support for cross-zone load balancing in Elastic Load Balancing, stack policies in AWS CloudFormation, and the Gateway-Virtual Tape Library in AWS Storage Gateway.

Changelog

  • Added support for cross-zone load balancing to the Elastic Load Balancing client.
  • Added support for a new gateway configuration, Gateway-Virtual Tape Library, to the AWS Storage Gateway client.
  • Added support for stack policies to the AWS CloudFormation client.
  • Fixed issue #176 where attempting to upload directly to Amazon S3 using the UploadBuilder failed when using a custom iterator that needs to be rewound.

Install/Download the Latest SDK

Configuring Advanced Logging on AWS Elastic Beanstalk

by Jim Flanagan | in .NET

Sometimes developers want more flexibility in logging for their IIS environments. For example, in the IIS log files on instances in a load-balanced AWS Elastic Beanstalk environment, the client IP for requests always appears to be the load balancer. Elastic Load Balancing adds an X-Forwarded-For header to each request that contains the actual client IP address, but there’s no way to log that with IIS’s default logging.

Microsoft has created Advanced Logging to provide more flexibility in logging. You can add Advanced Logging to your Elastic Beanstalk instances by making the MSI available (for example, in an Amazon S3 bucket), then scripting the configuration with Windows PowerShell.

By default, though, Advanced Logging puts its log files in a different location than the default IIS log files, so if you want to see them in Snapshot Logs, or have them published to S3, you need to tell Elastic Beanstalk about it.

We’ll build up an .ebextensions config file that addresses each of these points, then show the completed configuration at the end.

Download and Install Advanced Logging

First, we need to make the Advanced Logging installer available at a well-known location that will persist over the lifetime of our environment, because instances that get autoscaled into the environment will need to download it as well as any instances created when the environment is first brought up. This can be any URL-addressable location. For this example, we will upload the installer to an S3 bucket we control.

After uploading the AdvancedLogging64.msi to an S3 bucket and making it publicly readable, add the following to the config file (for example: .ebextensions\advancedlogging.config).

files:
  "c:/software/AdvancedLogging64.msi": 
  source: https://my-bucket.s3.amazonaws.com/AdvancedLogging64.msi
commands:
  00-install-advanced-logging:
    command: msiexec /i AdvancedLogging64.msi
    test: cmd /c "if exist c:\software\configured (exit 1) else (exit 0)"
    cwd: c:/software/
    waitAfterCompletion: 0
  02-set-configured:
    command: date /t > c:/software/configured
    waitAfterCompletion: 0

The files: key gets the MSI onto the instance, and the commands: key runs msiexec and then creates a file to signal that the install has been done. The test: subkey makes the command contingent on the non-existence of the signal file, so that the install happens only on the initial deployment, and not on every redeployment.

Configuring Advanced Logging

Advanced Logging is usually configured through the IIS Manager UI, but we can use PowerShell to accomplish this. The steps are:

  • disable IIS logging
  • add the X-Forwarded-For header to the list of possible fields
  • add X-Forwarded-For to the selected fields
  • enable Advanced Logging
  • iisreset

The PowerShell script to do this uses the WebAdministration module:

import-module WebAdministration

Set-WebConfigurationProperty `
  -Filter system.webServer/httpLogging `
  -PSPath machine/webroot/apphost `
  -Name dontlog `
  -Value true

Add-WebConfiguration "system.webServer/advancedLogging/server/fields" `
  -value @{id="X-Forwarded-For";sourceName="X-Forwarded-For";sourceType="RequestHeader";logHeaderName="X-Forwarded-For";category="Default";loggingDataType="TypeLPCSTR"}

$logDefinitions = Get-WebConfiguration "system.webServer/advancedLogging/server/logDefinitions"
foreach ($item in $logDefinitions.Collection) {
    Add-WebConfiguration `
      "system.webServer/advancedLogging/server/logDefinition/logDefinition[@baseFileName='$($item.baseFileName)']/selectedFields" `
      -value @{elementTagName="logField";id="X-Forwarded-For";logHeaderName="";required="false";defaultValue=""}
}

Set-WebConfigurationProperty `
  -Filter system.webServer/advancedLogging/server `
  -PSPath machine/webroot/apphost `
  -Name enabled `
  -Value true

iisreset

We can either put this script in a file and download it like the MSI, or simply inline it in the config file, like this (shown without the line breaks that were added above for readability):

files:
  "c:/software/configureLogging.ps1":
    content: |
      import-module WebAdministration
      Set-WebConfigurationProperty -Filter system.webServer/httpLogging -PSPath machine/webroot/apphost -Name dontlog -Value true
      Add-WebConfiguration "system.webServer/advancedLogging/server/fields" -value @{id="X-Forwarded-For";sourceName="X-Forwarded-For";sourceType="RequestHeader";logHeaderName="X-Forwarded-For";category="Default";loggingDataType="TypeLPCSTR"}
      $logDefinitions = Get-WebConfiguration "system.webServer/advancedLogging/server/logDefinitions"
      foreach ($item in $logDefinitions.Collection) {
        Add-WebConfiguration "system.webServer/advancedLogging/server/logDefinitions/logDefinition[@baseFileName='$($item.baseFileName)']/selectedFields" -value @{elementTagName="logField";id="X-Forwarded-For";logHeaderName="";required="false";defaultValue=""}
      }
      Set-WebConfigurationProperty -Filter system.webServer/advancedLogging/server -PSPath machine/webroot/apphost -Name enabled -Value true
      iisreset
commands:
  01-add-forwarded-header:
    command: Powershell.exe -ExecutionPolicy Bypass -File c:\software\configureLogging.ps1
    test: cmd /c "if exist c:\software\configured (exit 1) else (exit 0)"
    waitAfterCompletion: 0

This snippet creates the script file and executes it.

Configure Elastic Beanstalk Logging

As we mentioned before, the default location for Advanced Logging log files is different from where the IIS logs usually go. In order to get the Advanced Logging log files to show up for Snapshot Logs and log publication, we need to add some configuration files that tell the Snapshot Logs and log publication features where to look for log files. In this case, these files say that all files in C:\inetpub\logs\AdvancedLogs are eligible for snapshotting or log publication.

files:
  "c:/Program Files/Amazon/ElasticBeanstalk/config/publogs.d/adv-logging.conf":
    content: |
      C:\inetpub\logs\AdvancedLogs
  "c:/Program Files/Amazon/ElasticBeanstalk/config/taillogs.d/adv-logging.conf":
    content: |
      C:\inetpub\logs\AdvancedLogs

Combining all of the above snippets into a single configuration file looks like this:

files:
  "c:/software/AdvancedLogging64.msi": 
    source: https://my-bucket.s3.amazonaws.com/AdvancedLogging64.msi
  "c:/Program Files/Amazon/ElasticBeanstalk/config/publogs.d/adv-logging.conf":
    content: |
      C:\inetpub\logs\AdvancedLogs
  "c:/Program Files/Amazon/ElasticBeanstalk/config/taillogs.d/adv-logging.conf":
    content: |
      C:\inetpub\logs\AdvancedLogs
  "c:/software/configureLogging.ps1":
    content: |
      import-module WebAdministration
      Set-WebConfigurationProperty -Filter system.webServer/httpLogging -PSPath machine/webroot/apphost -Name dontlog -Value true
      Add-WebConfiguration "system.webServer/advancedLogging/server/fields" -value @{id="X-Forwarded-For";sourceName="X-Forwarded-For";sourceType="RequestHeader";logHeaderName="X-Forwarded-For";category="Default";loggingDataType="TypeLPCSTR"}
      $logDefinitions = Get-WebConfiguration "system.webServer/advancedLogging/server/logDefinitions"
      foreach ($item in $logDefinitions.Collection) {
        Add-WebConfiguration "system.webServer/advancedLogging/server/logDefinitions/logDefinition[@baseFileName='$($item.baseFileName)']/selectedFields" -value @{elementTagName="logField";id="X-Forwarded-For";logHeaderName="";required="false";defaultValue=""}
      }
      Set-WebConfigurationProperty -Filter system.webServer/advancedLogging/server -PSPath machine/webroot/apphost -Name enabled -Value true
      iisreset
commands:
  00-install-advanced-logging:
    command: msiexec /i AdvancedLogging64.msi
    test: cmd /c "if exist c:\software\configured (exit 1) else (exit 0)"
    cwd: c:/software/
    waitAfterCompletion: 0
  01-add-forwarded-header:
    command: Powershell.exe -ExecutionPolicy Bypass -File c:\software\configureLogging.ps1
    test: cmd /c "if exist c:\software\configured (exit 1) else (exit 0)"
    waitAfterCompletion: 0
  02-set-configured:
    command: date /t > c:/software/configured
    waitAfterCompletion: 0

For more information about how to customize Elastic Beanstalk environments, see the AWS Elastic Beanstalk Developer Guide.

AWS OpsWorks for Java

by Andrew Fitz Gibbon | in Java

Today, we have a guest post by Chris Barclay from the AWS OpsWorks team.


We are pleased to announce that AWS OpsWorks now supports Java applications. AWS OpsWorks is an application management service that makes it easy to model and manage your entire application. You can start from templates for common technologies, or build your own using Chef recipes with full control of deployments, scaling, monitoring, and automation of each component.

The new OpsWorks Java layer automatically configures Amazon EC2 instances with Apache Tomcat using sensible defaults in order to run your Java application. You can deploy one or more Java apps, such as a front-end web server and back-end business logic, on the same server. You can also customize or extend the Java layer. For example, you can choose a different Tomcat version, change the heap size, or use a different JDK.

To get started, go to the OpsWorks console and create a stack. Next, add a Java layer. In the navigation column, click Instances, add an instance, and start it.

Add Layer

Tomcat supports HTML, Java server pages (JSP), and Java class files. In this example, we’ll deploy a simple JSP that prints the date and your Amazon EC2 instance’s IP address, scale the environment using a load balancer, and discuss how OpsWorks can automate other tasks.

<%@ page import="java.net.InetAddress" %>
<html>
<body>
<%
    java.util.Date date = new java.util.Date();
    InetAddress inetAddress = InetAddress.getLocalHost();
%>
The time is 
<%
    out.println( date );
    out.println("<br>Your server's hostname is "+inetAddress.getHostName());
%>
<br>
</body>
</html>

A typical Java development process includes developing and testing your application locally, checking the source code into a repository, and deploying the built assets to your servers. The example has only one JSP file, but your application might have many files. You can handle that case by creating an archive of those files and directing OpsWorks to deploy the contents of the archive.

Let’s create an archive of the JSP and upload that archive to a location that OpsWorks can access. This archive will have only one file, but you can use the same procedure for any number of files.

To create an archive and upload it to Amazon S3

  1. Copy the example code to a file named simplejsp.jsp, and put the file in a directory named simplejsp.
  2. Create a .zip archive of the simplejsp directory.
  3. Create a public Amazon S3 bucket, upload simplejsp.zip to the bucket, and make the file public. For a description of how to perform this task, see Get Started With Amazon Simple Storage Service.

To add and deploy the app

  1. In the navigation column, click Apps, and then click Add an app.
  2. In Settings, specify a name and select the Java App Type.
  3. In Application Source, specify the http archive repository type, and enter the URL for the archive that you just uploaded in S3. It should look something like http://s3.amazonaws.com/your-bucket/simplejsp.zip.
  4. Then click Add App.

Add App

  1. Next, click deploy to deploy the app to your instance. The deployment causes OpsWorks to download the file from S3 to the appropriate location on the Java app server.
  2. Once the deployment is complete, click the Instances page and copy the public IP address to construct a URL as follows: http://publicIP/appShortName/appname.jsp.

Instances

For the example, the URL will look something like http://54.205.11.166/myjavaapp/simplejsp.jsp and when you navigate to the URL you should see something like:

Wed Oct 30 21:06:07 UTC 2013
Your server’s hostname is java-app1

Now that you have one instance running, you can scale the app to handle load spikes using time and load-based instance scaling.

To scale the app to handle load spikes

  1. Under Instances in the left menu, click Time-based and add an instance. You can then select the times that the instance will start and stop. Once you have multiple instances, you will probably want to load balance the traffic among them.
  2. In the Amazon EC2 console, create an Elastic Load Balancer and add it to your OpsWorks layer. OpsWorks automatically updates the load balancer’s configuration when instances are started and stopped.

It’s easy to customize OpsWorks to change the configuration of your EC2 instances. Most settings can be changed directly through the layer settings, such as adding software packages or Amazon EBS volumes. You can change how software is installed using Bash scripts and Chef recipes. You can also change existing recipes by modifying attributes. For example, you can use a different JDK by modifying the stack’s custom JSON:

{
  "opsworks_java" : {
    "jvm_pkg" : {
       "use_custom_pkg_location" : "true",
       "custom_pkg_location_url_rhel" :
           "http://s3.amazonaws.com/your-bucket/jre-7u45-linux-x64.gz"
    }
  }
} 

A few clicks in the AWS Management Console are all it takes to get started with OpsWorks. For more information on using the Java layer or customizing OpsWorks, see the documentation.

Specifying Conditional Constraints with Amazon DynamoDB Mapper

by Jason Fulghum | in Java

Conditional constraints are a powerful feature in the Amazon DynamoDB API. Until recently, there was little support for them in the Amazon DynamoDB Mapper. You could specify a version attribute for your mapped objects, and the mapper would automatically apply conditional constraints to give you optimistic locking, but you couldn’t explicitly specify your own custom conditional constraints with the mapper.

We’re excited to show you a new feature in the DynamoDB Mapper that lets you specify your own conditional constraints when saving and deleting data using the mapper. Specifying conditional constraints with the mapper is easy. Just pass in a DynamoDBSaveExpression object that describes your conditional constraints when you call DynamoDBMapper.save. If the conditions are all met when they’re evaluated on the server side, then your data will be saved, but if any of the conditions are not met, you’ll receive an exception in your application, letting you know about the conditional check failure.

Consider an application with a fleet of backend workers. When a worker starts processing a task, it marks the task’s status as IN_PROGRESS in a DynamoDB table. We want to prevent the case where multiple workers start working on the same task. We can do that easily with the new support for conditional constraints in the DynamoDB Mapper. When a worker attempts to set a task to IN_PROGRESS, the mapper simply adds a constraint that the previous state for the task should be READY. That way, if two workers try to start working on the task at the same time, the first one will be able to set its status to IN_PROGRESS and start processing the task, but the second worker will receive a ConditionalCheckFailedException since the status field wasn’t what it expected when it saved its data.

Here’s what the code looks like:

try {
   DynamoDBSaveExpression saveExpression = new DynamoDBSaveExpression();
   Map<String, ExpectedAttributeValue> expected = new HashMap<String, ExpectedAttributeValue>();
   expected.put("status",
      new ExpectedAttributeValue(new AttributeValue("READY")).withExists(true));

   saveExpression.setExpected(expected);

   mapper.save(obj, saveExpression);
} catch (ConditionalCheckFailedException e) {
   // This means our save wasn't recorded, since our constraint wasn't met
   // If this happens, the worker can simply look for a new task to work on
}

AWS at ZendCon 2013

by Jeremy Lindblom | in PHP

Recently, the AWS SDK for PHP team attended ZendCon, the largest conference in the U.S. that focuses on PHP development. AWS was a sponsor for ZendCon this year, so the entire PHP SDK team was able to attend. It was great to be able to talk to our customers and get feedback from those who have used AWS and our SDK. We want you to know that we are thankful for your feedback and that we’ve shared it all with various teams at AWS.

What You Heard About AWS at the Conference

It was apparent to everyone at ZendCon this year that "The Cloud" was a very hot topic. There were also many speakers that spoke about or mentioned AWS in their sessions and keynotes. I will call attention to a few of these in case you want to look back and read the slides or watch the videos.

Integrating Zend and AWS

Zend is the company that hosts ZendCon, and has helped manage the development of the PHP language since PHP 3. They also create commercial PHP products and services like Zend Studio and Zend Server. Though there are several ways in which Zend products and services integrate with AWS, I want to specifically call out two of them:

  1. Zend Server on AWS Marketplace – If you are a Zend Server user and an AWS customer, you can easily launch Zend Server on Amazon EC2 through AWS Marketplace.
  2. AWS SDK Module for Zend Framework – If you write PHP applications with Zend Framework 2, you can use the AWS SDK ZF2 Module to easily integrate the AWS SDK for PHP into your application. To learn about how to install and use the module, see the AWS SDK ZF2 Module README.

Until Next Year…

We enjoyed being at ZendCon and connecting with you. We also hope that you were able to learn more about AWS while you were there, or if you couldn’t make it, through the presentation slides afterward. If you need any help using the AWS SDK for PHP or have any feedback for us, be sure to connect with us via the PHP SDK forum or our GitHub repo.

Providing credentials to the AWS SDK for PHP

by Michael Dowling | in PHP

In order to authenticate requests, the AWS SDK for PHP requires credentials in the form of an AWS access key ID and secret access key. In this post, we’ll discuss how to configure credentials in the AWS SDK for PHP.

Configuring credentials

There are several methods that can be used for configuring credentials in the SDK. The method that you use to supply credentials to your application is up to you, but we recommend that you use IAM roles when running on Amazon EC2 or use environment variables when running elsewhere.

Credentials can be specified in several ways:

  1. IAM roles (Amazon EC2 only)
  2. Environment variables
  3. Configuration file and the service builder
  4. Passing credentials into a client factory method

If you do not provide credentials to the SDK using a factory method or a service builder configuration file, the SDK checks if the AWS_ACCESS_KEY_ID and AWS_SECRET_KEY environment variables are present. If defined, these values are used as your credentials. If these environment variables are not found, the SDK attempts to retrieve IAM role credentials from an Amazon EC2 instance metadata server. If your application is running on Amazon EC2 and the instance metadata server responds successfully with credentials, they are used to authenticate requests. If none of the above methods successfully yield credentials, an Aws\Common\Exception\InstanceProfileCredentialsException is thrown.
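
For example, here is a minimal sketch of relying on that default chain (the region value and the listBuckets call are only illustrative):

<?php

require 'vendor/autoload.php';

use Aws\S3\S3Client;
use Aws\Common\Exception\InstanceProfileCredentialsException;

// No credentials are passed to the factory, so the SDK falls back to the
// default chain: environment variables first, then IAM role credentials
// retrieved from the EC2 instance metadata server.
$s3 = S3Client::factory(array('region' => 'us-east-1'));

try {
    $result = $s3->listBuckets();
} catch (InstanceProfileCredentialsException $e) {
    // None of the credential sources yielded usable credentials
    echo 'No credentials found: ' . $e->getMessage() . "\n";
}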

IAM roles (Amazon EC2 only)

IAM roles are the preferred method for providing credentials to applications running on Amazon EC2. IAM roles remove the need to worry about credential management from your application. They allow an instance to "assume" a role by retrieving temporary credentials from the instance’s metadata server. These temporary credentials allow access to the actions and resources that the role’s policy allows.

When launching an EC2 instance, you can choose to associate it with an IAM role. Any application running on that EC2 instance is then allowed to assume the associated role. Amazon EC2 handles all the legwork of securely authenticating instances to the IAM service to assume the role and periodically refreshing the retrieved role credentials, keeping your application secure with almost no work on your part.

If you do not provide credentials and no environment variable credentials are available, the SDK attempts to retrieve IAM role credentials from an Amazon EC2 instance metadata server. These credentials are available only when running on Amazon EC2 instances that have been configured with an IAM role.

Caching IAM role credentials

While using IAM role credentials is the preferred method for providing credentials to an application running on an Amazon EC2 instance, the roundtrip from the application to the instance metadata server on each request can introduce latency. In these situations, you might find that utilizing a caching layer on top of your IAM role credentials can eliminate the introduced latency.

The easiest way to add a cache to your IAM role credentials is to specify a credentials cache using the credentials.cache option in a client’s factory method or in a service builder configuration file. The credentials.cache configuration setting should be set to an object that implements Guzzle’s Guzzle\Cache\CacheAdapterInterface. This interface provides an abstraction layer over various cache backends, including Doctrine Cache, Zend Framework 2 cache, etc.

<?php

require 'vendor/autoload.php';

use Doctrine\Common\Cache\FilesystemCache;
use Guzzle\Cache\DoctrineCacheAdapter;

// Create a cache adapter that stores data on the filesystem
$cacheAdapter = new DoctrineCacheAdapter(new FilesystemCache('/tmp/cache'));

// Provide a credentials.cache to cache credentials to the file system
$s3 = Aws\S3\S3Client::factory(array(
    'credentials.cache' => $cacheAdapter
));

With the addition of credentials.cache, credentials are now cached to the local filesystem using Doctrine’s caching system. Every request that uses this cache adapter first checks if the credentials are in the cache. If the credentials are found in the cache, the client then ensures that the credentials are not expired. In the event that cached credentials become expired, the client automatically refreshes the credentials on the next request and populates the cache with the updated credentials.

A credentials cache can also be used in a service builder configuration:

<?php

// File saved as /path/to/custom/config.php

use Doctrine\Common\Cache\FilesystemCache;
use Guzzle\Cache\DoctrineCacheAdapter;

$cacheAdapter = new DoctrineCacheAdapter(new FilesystemCache('/tmp/cache'));

return array(
    'includes' => array('_aws'),
    'services' => array(
        'default_settings' => array(
            'params' => array(
                'credentials.cache' => $cacheAdapter
            )
        )
    )
);

If you were to use the above configuration file with a service builder, then all of the clients created through the service builder would utilize a shared credentials cache object.
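
Here is a short sketch of what that usage could look like (the second 'dynamodb' client is shown only to illustrate that both clients come from the same builder):

<?php

require 'vendor/autoload.php';

use Aws\Common\Aws;

// Build the service builder from the configuration file shown above
$aws = Aws::factory('/path/to/custom/config.php');

// Both clients are created through the same builder, so they share the
// credentials cache configured under 'default_settings'
$s3       = $aws->get('s3');
$dynamoDb = $aws->get('dynamodb');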

Environment variables

If you do not provide credentials to a client’s factory method or via a service builder configuration, the SDK attempts to find credentials in your environment by checking in the $_SERVER superglobal and using the getenv() function, looking for the AWS_ACCESS_KEY_ID and AWS_SECRET_KEY environment variables.

If you are hosting your application on AWS Elastic Beanstalk, you can set the AWS_ACCESS_KEY_ID and AWS_SECRET_KEY environment variables through the AWS Elastic Beanstalk console so that the SDK can use those credentials automatically.
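
As a rough sketch, once those variables are set, no credentials need to be passed to the client factory (the putenv() calls below merely stand in for values you would normally set in the shell or the Elastic Beanstalk console, and the placeholder strings are not real keys):

<?php

require 'vendor/autoload.php';

// Stand-ins for environment variables configured outside the application
putenv('AWS_ACCESS_KEY_ID=your-aws-access-key-id');
putenv('AWS_SECRET_KEY=your-aws-secret-access-key');

// No credentials are passed, so the SDK picks them up from the environment
$s3 = Aws\S3\S3Client::factory(array(
    'region' => 'us-west-1'
));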

Configuration file and the service builder

The SDK provides a service builder that can be used to share configuration values across multiple clients. The service builder allows you to specify default configuration values (e.g., credentials and regions) that are applied to every client. The service builder is configured using either JSON configuration files or PHP scripts that return an array.

Here’s an example of a configuration script that returns an array of configuration data that can be used by the service builder:

<?php

// File saved as /path/to/custom/config.php

return array(
    // Bootstrap the configuration file with AWS specific features
    'includes' => array('_aws'),
    'services' => array(
        // All AWS clients extend from 'default_settings'. Here we are
        // overriding 'default_settings' with our default credentials and
        // providing a default region setting.
        'default_settings' => array(
            'params' => array(
                'key'    => 'your-aws-access-key-id',
                'secret' => 'your-aws-secret-access-key',
                'region' => 'us-west-1'
            )
        )
    )
);

After creating and saving the configuration file, you need to instantiate a service builder.

<?php

// Assuming the SDK was installed via Composer
require 'vendor/autoload.php';

use Aws\Common\Aws;

// Create the AWS service builder, providing the path to the config file
$aws = Aws::factory('/path/to/custom/config.php');

At this point, you can now create clients using the get() method of the Aws object:

$s3 = $aws->get('s3');

Passing credentials into a factory method

A simple way to specify your credentials is by injecting them directly into the factory method of a client. This is useful for quick scripting, but be careful to not hard-code your credentials inside of your applications. Hard-coding your credentials inside of an application can be dangerous because it is easy to commit your credentials into an SCM repository, potentially exposing your credentials to more people than intended. It can also make it difficult to rotate credentials in the future.

<?php

$s3 = Aws\S3\S3Client::factory(array(
    'key'    => 'my-access-key-id',
    'secret' => 'my-secret-access-key'
));

AWS SDK for Ruby Core Developer Preview

by Trevor Rowe | in Ruby

A few months ago, Loren blogged about the upcoming version 2 of the AWS SDK for Ruby. Shortly after that, we published our work-in-progress code to GitHub as aws/aws-sdk-core-ruby. I am happy to announce that AWS SDK Core has stabilized enough to enter a developer preview period. It currently supports 30+ services.

To install AWS SDK Core from Rubygems:

gem install aws-sdk-core --pre

Or with Bundler:

gem 'aws-sdk-core'

What is AWS SDK Core?

Version 2 of the Ruby SDK will separate the higher level service abstractions from the service clients. We are focusing our initial efforts to ensure the new client libraries are robust, extensible, and more capable than those in version 1. We are still exploring how best to migrate higher level abstractions from version 1 into version 2.

AWS SDK Core uses a different namespace. This allows you to install both aws-sdk and aws-sdk-core in the same application.
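
As a rough illustration, both gems can be loaded side by side in the same process (the version 2 client constructor shown below is an assumption based on the preview; check the GitHub repository for the current interface):

require 'aws-sdk'       # version 1, which defines the AWS namespace
require 'aws-sdk-core'  # version 2 preview, which defines the Aws namespace

# Version 1 client, from the AWS namespace
s3_v1 = AWS::S3.new

# Version 2 preview client, from the Aws namespace (constructor assumed here)
s3_v2 = Aws::S3.new(region: 'us-east-1')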

Versioning Strategy

AWS SDK Core is being released as version ‘2.0.0-rc.1’. This shows our intention that core will be the successor to the current Ruby SDK.

Links of Interest

New Twitter Account for the AWS SDK for PHP

by Jeremy Lindblom | in PHP

Last week, we opened up a new Twitter account to run in parallel with this blog. You can find us there at @awsforphp. We will tweet about new posts on the blog, tips and tricks for using the SDK, new releases, and upcoming conferences and events that we’ll be attending.

Be sure to follow us to keep up-to-date. :-)

AWS SDK for Ruby and Nokogiri

by Trevor Rowe | in Ruby

In two weeks, on November 19, 2013, we will be removing the upper bound from the nokogiri gem dependency in version 1 of the AWS SDK for Ruby. We’ve been discussing this change with users for a while on GitHub.

Why Is There Currently an Upper Bound?

Nokogiri removed support for Ruby 1.8 with the release of version 1.6. The Ruby SDK requires Nokogiri and supports Ruby 1.8.7+. The current restriction ensures that Ruby 1.8 users can install the aws-sdk gem.

Why Remove the Upper Bound?

Users of the Ruby SDK who use Ruby 1.9+ have been asking for the upper bound to be removed. Some want access to features of Nokogiri 1.6+, while others run into dependency-management headaches when multiple libraries require Nokogiri with mutually exclusive version constraints.

The Ruby SDK has been tested against Nokogiri 1.4+. By removing this restriction, we allow end users to choose the version of Nokogiri that best suits their needs. Libraries that have narrow dependencies can make managing co-dependencies difficult. We want to help remove some of that headache.

Will it Still be Possible to Install the Ruby SDK in Ruby 1.8.7?

Yes, it will still be possible to install the Ruby SDK in Ruby 1.8.7 when the restriction is removed. If your target environment already has Nokogiri < 1.6 installed, then you don’t need to do anything. Otherwise, you will need to install Nokogiri before installing the aws-sdk gem.

If you are using bundler to install the aws-sdk gem, add an entry for Nokogiri:

gem 'aws-sdk'
gem 'nokogiri', '< 1.6'

If you are using the gem command to install the Ruby SDK, you must ensure that a compatible version of Nokogiri is present in Ruby 1.8.7.

gem install nokogiri --version="<1.6"
gem install aws-sdk

You should not need to make any changes if any of the following are true:

  • You have a Gemfile.lock deployed with your application
  • Nokogiri < 1.6 is already installed in your target environment
  • Your deployed environment does not reinstall gems when restarted

What About Version 2 of the Ruby SDK?

Version 2 of the Ruby SDK no longer relies directly on Nokogiri. Instead, it has dependencies on multi_xml and multi_json. This allows you to use Ox and Oj for speed, or Nokogiri if you prefer. Additionally, the Ruby SDK will work with pure-Ruby XML and JSON parsing libraries, which makes distributing to varied environments much simpler.
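
For example, a Gemfile along these lines (just a sketch; the ox and oj gems are optional) lets multi_xml and multi_json pick up the faster native parsers when they are installed:

# Gemfile
source 'https://rubygems.org'

gem 'aws-sdk-core'

# Optional native parsers that multi_xml and multi_json will prefer when present
gem 'ox'
gem 'oj'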

Keep the Feedback Coming

We appreciate all of the users who have chimed in on this issue and shared their feedback. User feedback definitely helps guide development of the Ruby SDK. If you haven’t had a chance to check out our work on v2 of the Ruby SDK, please take a look at the GitHub repo for AWS Ruby Core.

Thanks!

Working with Amazon S3 Object Versions and the AWS SDK for .NET

by Norm Johanson | in .NET

Amazon S3 allows you to enable versioning for a bucket. You can enable or disable versioning with the SDK by calling the PutBucketVersioning method. Note: all code samples in this post were written for version 2 of the SDK. Users of version 1 of the SDK will notice some slight name changes.

s3Client.PutBucketVersioning(new PutBucketVersioningRequest
{
    BucketName = versionBucket,
    VersioningConfig = new S3BucketVersioningConfig() { Status = VersionStatus.Enabled }
});

Once versioning is enabled, every PutObject call with the same key will add a new version of the object with a different version ID instead of overwriting the object. For example, running the code below will create three versions of the "sample.txt" object. The sleeps are added to give a more obvious difference in the timestamps.

var putRequest = new PutObjectRequest
{
    BucketName = versionBucket,
    Key = "sample.txt",
    ContentBody = "Content For Version 1"
};

s3Client.PutObject(putRequest);

Thread.Sleep(TimeSpan.FromSeconds(10));

s3Client.PutObject(new PutObjectRequest
{
    BucketName = versionBucket,
    Key = "sample.txt",
    ContentBody = "Content For Version 2"
});

Thread.Sleep(TimeSpan.FromSeconds(10));

s3Client.PutObject(new PutObjectRequest
{
    BucketName = versionBucket,
    Key = "sample.txt",
    ContentBody = "Content For Version 3"
});

Now, if you call the GetObject method without specifying a version ID like this:

var getRequest = new GetObjectRequest
{
    BucketName = versionBucket,
    Key = "sample.txt"
};

using (GetObjectResponse getResponse = s3Client.GetObject(getRequest))
using (StreamReader reader = new StreamReader(getResponse.ResponseStream))
{
    Console.WriteLine(reader.ReadToEnd());
}

// Outputs:
Content For Version 3

It will print out the contents of the last object that was put into the bucket.

Use the ListVersions method to get the list of versions.

var listResponse = s3Client.ListVersions(new ListVersionsRequest
{
    BucketName = versionBucket,
    Prefix = "sample.txt"                    
});

foreach(var version in listResponse.Versions)
{
    Console.WriteLine("Key: {0}, Version ID: {1}, IsLatest: {2}, Modified: {3}", 
        version.Key, version.VersionId, version.IsLatest, version.LastModified);
}

// Output:
Key: sample.txt, Version ID: nx5sVCpUSdpHzPBpOICF.eELc2nUsm3c, IsLatest: True, Modified: 10/29/2013 4:45:07 PM
Key: sample.txt, Version ID: LOgcIIrvtM0ZqYfkvfRz3UMdgdmRXNWE, IsLatest: False, Modified: 10/29/2013 4:44:56 PM
Key: sample.txt, Version ID: XxnZRKXHZ7cHYiogeCHXXxccojj9DLK5, IsLatest: False, Modified: 10/29/2013 4:44:46 PM

To get a specific version of an object, you simply need to specify the VersionId property when performing a GetObject.

var earliestVersion = listResponse.Versions.OrderBy(x => x.LastModified).First();

var getRequest = new GetObjectRequest
{
    BucketName = versionBucket,
    Key = "sample.txt",
    VersionId = earliestVersion.VersionId
};

using(GetObjectResponse getResponse = s3Client.GetObject(getRequest))
using(StreamReader reader = new StreamReader(getResponse.ResponseStream))
{
    Console.WriteLine(reader.ReadToEnd());
}

// Outputs:
Content For Version 1

Deleting a versioned object works differently than deleting a non-versioned object. If you call delete like this:

s3Client.DeleteObject(new DeleteObjectRequest
{
    BucketName = versionBucket,
    Key = "sample.txt"
});

and then try to do a GetObject for the "sample.txt" object, S3 will return an error that the object doesn’t exist. What S3 actually does when you call delete for a versioned object is insert a delete marker. You can see this if you list the versions again.

var listResponse = s3Client.ListVersions(new ListVersionsRequest
{
    BucketName = versionBucket,
    Prefix = "sample.txt"                    
});

foreach (var version in listResponse.Versions)
{
    Console.WriteLine("Key: {0}, Version ID: {1}, IsLatest: {2}, IsDeleteMarker: {3}", 
        version.Key, version.VersionId, version.IsLatest, version.IsDeleteMarker);
}

// Outputs:
Key: sample.txt, Version ID: YRsryuUODxDujL4Y4iJjRLKweHrV0t2U, IsLatest: True, IsDeleteMarker: True
Key: sample.txt, Version ID: nx5sVCpUSdpHzPBpOICF.eELc2nUsm3c, IsLatest: False, IsDeleteMarker: False
Key: sample.txt, Version ID: LOgcIIrvtM0ZqYfkvfRz3UMdgdmRXNWE, IsLatest: False, IsDeleteMarker: False
Key: sample.txt, Version ID: XxnZRKXHZ7cHYiogeCHXXxccojj9DLK5, IsLatest: False, IsDeleteMarker: False

If you want to delete a specific version of an object, set the VersionId property when calling DeleteObject. This is also how you can restore an object: delete the delete marker.

var deleteMarkerVersion = listResponse.Versions.FirstOrDefault(x => x.IsDeleteMarker && x.IsLatest);
if (deleteMarkerVersion != null)
{
    s3Client.DeleteObject(new DeleteObjectRequest
    {
        BucketName = versionBucket,
        Key = "sample.txt",
        VersionId = deleteMarkerVersion.VersionId
    });
}

Now, calls to GetObject for the "sample.txt" object will succeed again.