Category: Ruby


Using Client-Side Encryption for S3 in the AWS SDK for Ruby

by Alex Wood | in Ruby

What is client-side encryption, and why might I want to use it?

If you wish to store sensitive data in Amazon S3 with the AWS SDK for Ruby, you have several ways of managing the safety and security of the data. One good practice is to use HTTPS whenever possible to protect your data in transit. Another is to use S3’s built-in server-side encryption to protect your data at rest. In this post, we highlight yet another option: client-side encryption.

Client-side encryption is a little more involved than server-side encryption, since you bring your own security credentials. However, it has the added benefit that data never exists in an unencrypted state outside of your execution environment.

How do I use client-side encryption with the AWS SDK for Ruby?

The SDK for Ruby does most of the heavy lifting for you when using client-side encryption for your S3 objects. When performing read and write operations on S3, you can specify various options in an option hash passed in to the S3Object#write and S3Object#read methods.

One of these options is :encryption_key, which accepts either an RSA key (for asymmetric encryption), or a string (for symmetric encryption). The SDK for Ruby then uses your key to encrypt an auto-generated AES key, which is used to encrypt and decrypt the payload of your message. The encrypted form of your auto-generated key is stored with the headers of your object in S3.

Here is a short example you can try to experiment with client-side encryption yourself:

require 'aws-sdk'
require 'openssl'

# Set your own bucket/key/data values
bucket = 's3-bucket'
key = 's3-object-key'
data = 'secret message'

# Creates a string key - store this!
symmetric_key = OpenSSL::Cipher::AES256.new(:CBC).random_key

options = { :encryption_key => symmetric_key }
s3_object = AWS.s3.buckets[bucket].objects[key]

# Writing an encrypted object to S3
s3_object.write(data, options)

# Reading the object from S3 and decrypting
puts s3_object.read(options)
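
For intuition about the envelope scheme described above, where your key encrypts an auto-generated AES key that in turn encrypts the payload, here is a minimal sketch that uses only Ruby's OpenSSL library. The helper names envelope_encrypt and envelope_decrypt and the choice of AES-256-ECB for wrapping the data key are assumptions made for illustration; this is not the SDK's actual implementation.

```ruby
require 'openssl'

# Encrypt the payload with a fresh random AES data key, then encrypt
# ("wrap") that data key with the caller's master key.
def envelope_encrypt(master_key, plaintext)
  data_cipher = OpenSSL::Cipher.new('aes-256-cbc').encrypt
  data_key = data_cipher.random_key # auto-generated per object
  iv = data_cipher.random_iv
  ciphertext = data_cipher.update(plaintext) + data_cipher.final

  wrap_cipher = OpenSSL::Cipher.new('aes-256-ecb').encrypt
  wrap_cipher.key = master_key
  wrapped_key = wrap_cipher.update(data_key) + wrap_cipher.final

  { :ciphertext => ciphertext, :iv => iv, :wrapped_key => wrapped_key }
end

# Unwrap the data key with the master key, then decrypt the payload.
def envelope_decrypt(master_key, envelope)
  unwrap_cipher = OpenSSL::Cipher.new('aes-256-ecb').decrypt
  unwrap_cipher.key = master_key
  data_key = unwrap_cipher.update(envelope[:wrapped_key]) + unwrap_cipher.final

  data_cipher = OpenSSL::Cipher.new('aes-256-cbc').decrypt
  data_cipher.key = data_key
  data_cipher.iv = envelope[:iv]
  data_cipher.update(envelope[:ciphertext]) + data_cipher.final
end

master_key = OpenSSL::Cipher.new('aes-256-cbc').encrypt.random_key
envelope = envelope_encrypt(master_key, 'secret message')
puts envelope_decrypt(master_key, envelope)
```

Only the wrapped key and the ciphertext would ever leave your process in a scheme like this, which is why losing the master key makes the data unrecoverable.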

There are a couple of practical matters you should consider. One is that if you lose the key used to encrypt the object, you will be unable to decrypt its contents. You should securely store your key (e.g., as a file or using a separate key management system) and load it when needed for writing or reading objects. Additionally, encrypting and decrypting your objects brings some performance overhead, so you should use it only when needed (this overhead varies depending on the size and type of key used).
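
As one possible way to keep a symmetric key in a file, the sketch below writes the raw key bytes with owner-only permissions and reads them back when building the options hash. The helpers save_key and load_key and the file name are hypothetical, and a dedicated key management system may be a better fit for production use.

```ruby
require 'openssl'
require 'tmpdir'

# Write the raw key bytes to path, readable and writable by the owner only.
def save_key(path, key)
  File.open(path, 'wb', 0600) { |file| file.write(key) }
end

# Read the raw key bytes back from path.
def load_key(path)
  File.binread(path)
end

Dir.mktmpdir do |dir|
  key_path = File.join(dir, 's3-client-side.key') # hypothetical file name
  symmetric_key = OpenSSL::Cipher.new('aes-256-cbc').encrypt.random_key
  save_key(key_path, symmetric_key)

  # Later, load the key and pass it along with your read and write calls.
  options = { :encryption_key => load_key(key_path) }
  puts options[:encryption_key] == symmetric_key
end
```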

You can read more about the encryption choices available to you with the AWS SDK for Ruby in our API documentation. You can also read more about general best practices for security in AWS by following the AWS Security Blog. As you consider the choices available for securing your data, we hope you find them effective and simple to use.

Happy Birthday, SDK! Now Let’s Celebrate the Future

by Loren Segal | in Ruby

Today marks the second anniversary of the AWS SDK for Ruby. Over the last two years, the SDK has grown and developed to support the full array of available AWS services and high-level features, like resource abstractions and enumeration, as well as Rails email and model layer integration. We are honored by the positive customer feedback we’ve received so far. We hope to continue earning your support as we move forward with the SDK.

One of the things I am personally proud of is the increase in community involvement we’ve received on GitHub within the last year since I joined Amazon. As someone who comes from the open source world, it’s great to see that process working so well in the SDK. The bug reports and pull requests that we’ve gotten from users have been top-notch quality and extremely helpful to everyone else using the gem, and we only want to see that level of engagement get better as time goes on. We want to thank all of our users who have been involved in the process and have helped to improve the SDK.

So here’s to the AWS SDK for Ruby turning 2 years old!

On 2.0 the next one

Of course, having a great SDK with a great community does not mean we should stop innovating, and so today, on its 2nd anniversary, we are also marking the start of development on version 2.0 of the AWS SDK for Ruby. We’re excited to share some of the great ideas we’ve been kicking around that will modernize the SDK and make it even easier to use. More importantly though, we are opening up this dialog because we also want your feedback about the features you believe belong in the next version of the Ruby SDK. There is still time to get your ideas in.

Over the coming weeks, we plan on sharing more information, and code, about version 2.0 of the SDK. If you want a front seat in the development, or even want to help out, watch this space. Until that time, here are some of the things that will be coming to the new version of the SDK:

Memoization by default

Currently, operations called from the high-level abstractions of various services (like Amazon S3’s "bucket collection" resource) are set up not to memoize return values from requests by default. This means that in many cases, your code can end up sending more requests than necessary to get at data you might have already loaded in a previous request. Furthermore, memoization can currently work somewhat inconsistently across services. A great example of this can be illustrated by grabbing user data from an Amazon EC2 instance:

instance = AWS.ec2.instances.first
puts instance.user_data # sends one request
puts instance.user_data # sends ANOTHER request

In version 2.0 of the SDK, we plan on making resources memoize data by default. In other words, when hydrating a resource from a request, that resource will maintain all of the data from the original request. Any further calls on that resource will use only the data from that original request. This will ensure a more consistent experience when dealing with services like EC2, and will improve performance of the SDK in many cases. If you want to explicitly reload fresh data from the service, you will still be able to do so by hydrating a new resource object.
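
To illustrate the idea of memoizing by default, here is a small sketch in plain Ruby. The MemoizedInstance class and its fetcher block are hypothetical stand-ins for a resource and the request that hydrates it; this is not code from the SDK or a preview of the 2.0 API.

```ruby
# Hypothetical resource class, for illustration only (not SDK code).
class MemoizedInstance
  attr_reader :request_count

  # The fetcher block stands in for the request that describes the instance.
  def initialize(&fetcher)
    @fetcher = fetcher
    @request_count = 0
  end

  def user_data
    attributes[:user_data]
  end

  private

  # Runs the fetcher on first access and reuses the result afterwards.
  def attributes
    @attributes ||= begin
      @request_count += 1
      @fetcher.call
    end
  end
end

instance = MemoizedInstance.new { { :user_data => 'echo hello' } }
instance.user_data # triggers the only fetch
instance.user_data # served from the memoized attributes
puts instance.request_count
```

In this sketch, getting fresh data means building a new MemoizedInstance, which mirrors the idea of hydrating a new resource object.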

High-level abstractions moved into separate gems

The AWS SDK for Ruby is very extensive, but that also means it is very large. With 30+ supported services, the core SDK gem contains almost 400 classes in over 500 files with more than 26,000 lines of code. That’s a lot of code to manage in just one package. This one package may also contain features that could conflict with your application, like Rails integration and XML libraries that may not be needed, and, in some cases, might require being disabled altogether.

Splitting the SDK into multiple packages will help keep the codebase small and focused while avoiding these integration conflicts. A small core codebase with small extra component libraries also means that contributors to the SDK will have an easier time navigating the libraries, running tests, and submitting patches. We believe that making it easier for our users to contribute code to the SDK is a feature, not a side effect, of development. Anything we can do to improve the lives of those submitting pull requests is effort well spent.

Built to be extensible: a strong plugin API

In order to support a highly modular SDK with a healthy third-party ecosystem, we need a strong plugin API. Version 2.0 of the AWS SDK for Ruby will be developed with extensibility as a primary concern. More importantly, to help ensure the quality of this API, we plan on eating our own dog food. The plugin architecture, built on top of Seahorse, will be a first class citizen in the new SDK, and will be how we implement all of the core functionality of the library. This means that if you are a third-party developer writing an abstraction for any given service, you should feel comfortable knowing that the APIs you are using will be well supported because they will be the same ones we used to implement the very service you are wrapping.

Dropping support for Ruby 1.8.x

Finally, with the recently announced end of life for Ruby 1.8.7, we believe that it is time to start moving forward, and we will no longer be supporting Ruby 1.8.7 in version 2.0 of the aws-sdk gem. We heard you loud and clear when discussing this issue on GitHub, and we are aware of the maintenance burden that supporting 1.8.7 will bring. We believe that it is in the best interest of our customers to support the latest and greatest versions of Ruby, 1.9.x and 2.0.x.

Note that users on Ruby 1.8.x will still be able to use version 1.0 of the SDK, and we will have more information on how we will continue to support legacy users in upcoming posts.

More to come; help us define the future

Of course there is much more to talk about in this new version of the AWS SDK for Ruby, and we plan on bringing up many of these topics with future posts in the coming weeks. If the features we did manage to talk about sound interesting to you, don’t be shy to get involved. If you think we missed any important details, you should also make your voice heard. This is your opportunity to help us define what the future of the SDK will look like, and we will be listening for your comments. You can get in touch with us either here, the forums, or on GitHub.

Now, let’s take one last moment to say happy birthday to the AWS SDK for Ruby, and another moment to get excited about what’s to come!

Release v1.12.0

by Trevor Rowe | in Ruby

We just released v1.12.0 of the AWS SDK for Ruby (aws-sdk gem). This release includes the new aws-rb REPL that Loren blogged about. It also adds support for watermarks and max frame rates in Amazon Elastic Transcoder, resolves a number of issues, and adds a few new configuration options.

We are slowly deprecating all of the service-prefixed configuration options. This release makes that easier, as you can now group configuration options by service.

# deprecated
AWS.config(:s3_region => 'us-west-2')

# new format
AWS.config(:s3 => {
  :region => 'us-west-2'
})

We are going to make it possible to set any configuration options per service as well. This will allow you to do things like specify a greater value for max retries for a single service.

You can read the release notes here.

Ruby 1.8 End of Life Plan

by Trevor Rowe | in Ruby

You have probably heard that Ruby 1.8.7 has officially reached its end of life. This makes it important for us to discuss what our plans will be for the AWS SDK for Ruby (aws-sdk gem) with regard to Ruby 1.8.

We currently support as far back as Ruby 1.8.7. There are now additional considerations with the passing of the end of life date. We would love to hear your feedback. There is an issue on our GitHub project where the discussion has already started. Please join in and share your input!

A New Addition to the AWS SDK for Ruby

by Loren Segal | in Ruby

Last week we quietly welcomed a new addition to the AWS SDK for Ruby organization. We’re proud to publicly announce that Alex Wood has joined our team and is now a core contributor to the Ruby SDK, as well as some of our other ongoing Ruby-based projects. He’s already jumped in on GitHub where he has helped us close a bunch of open issues on the SDK. We expect him to get more involved in the development of the SDK as well as helping out on our forums, blogs, and other public pages as time goes on. So if you see a new face helping you out by the handle awood45 on GitHub, or @alexwwood on Twitter, make sure to give him a warm welcome! You might even want to pass on a congratulations or two, as he managed to start a new job and get married, all in the span of one week!

Using the AWS SDK for Ruby from Your REPL

by Loren Segal | in Ruby

We are all used to spinning up irb or Pry sessions to play with Ruby’s features interactively. Some people reading this might even be using the rails console on a daily basis, which can make digging through Ruby on Rails applications much easier. Well, we’re actually working on bringing that same functionality into the AWS SDK for Ruby!

The Backstory

We’ve been using an internal version of our interactive console for a long time in order to more easily develop and debug new features in the Ruby SDK, but it’s always been a very homegrown and customized tool. We didn’t feel that it was in the right state to be published along with the SDK, but we have been looking at extracting it into a public executable in the aws-sdk gem that we are comfortable supporting.

And then a couple of weeks ago, a developer by the name of Mike Williams (@woollyams on Twitter) posted a Gist that showed how to launch a REPL for the Ruby SDK using Pry, which got us thinking about (and working on) extracting our REPL a little bit more.

Tweet about the Pry REPL (@woollyams)

Thanks, Mike, for indirectly helping to move this forward!

Introducing the REPL

Trevor took the above Gist and did some refactoring to make it work with other services, as well as play more nicely with some of the new convenience features in the Ruby SDK. The end result is now sitting in a branch on the aws/aws-sdk-ruby repository (aws-sdk-repl). You can try the REPL out for yourself by checking out the repository and running ./bin/aws-rb:

$ git clone git://github.com/aws/aws-sdk-ruby
$ cd aws-sdk-ruby
$ git checkout aws-sdk-repl
$ ./bin/aws-rb --help
Usage: aws-rb [options]
        --repl REPL                  specify the repl environment, pry or irb
    -l, --[no-]log                   log client requets, on by default
    -c, --[no-]color                 colorize request logging, on by default
    -d, --[no-]debug                 log HTTP wire traces, off by default
    -Idirectory                      specify $LOAD_PATH directory (may be used more than once)
    -rlibrary                        require the library
    -v, --verbose                    enable client logging and HTTP wire tracing
    -q, --quiet                      disable client logging and HTTP wire tracing
    -h, --help

The Features

Pry by default

The tool currently attempts to use Pry by default, if available, and falls back to a plain old irb session, if it is not. If you want to stick to irb or Pry, pass --repl irb or --repl pry respectively. You can also set this through environment variables (more information on this is discussed in the pull request).

Logging interactively

Also by default, we show simple logging for each request that the SDK sends. This can shed a lot of light onto how you are using the SDK. For example, when you list buckets from S3, you might see:

AWS> s3.buckets.map(&:name)
[AWS S3 200 0.933647 0 retries] list_buckets()  
=> ["mybucket1", "mybucket2", "mybucket3", ...]

You can also show HTTP wire traces by passing -d to the console to run in debug mode. These values can all also be set through environment variables.

Logging existing scripts

Finally, if you have a small script that you want to debug or profile, you can use the aws-rb shell to quickly log all requests and ensure that the right things are happening in that script:

$ cat test.rb
require 'aws-sdk'
AWS.s3.buckets.to_a
AWS.sqs.queues.to_a
$ ./bin/aws-rb -d -I . -r test.rb
<...WIRE TRACE DATA HERE...>
[AWS S3 200 1.183635 0 retries] list_buckets()  
<...WIRE TRACE DATA HERE...>
[AWS SQS 200 0.836059 0 retries] list_queues()  

Looks like we got the requests we were looking for!

Making It Live

We are currently crossing our t’s and dotting our i’s on this new feature, but if you have feedback on the new REPL, feel free to comment on pull request #270. We’d love to hear anything you have to say, from any large feature omissions all the way down to suggestions for the executable name. Please join in on the conversation at GitHub.

Working with Multiple Regions

by Trevor Rowe | in Ruby

In a previous blog post, I introduced the new :region configuration option for the AWS SDK for Ruby (aws-sdk gem). Beyond simplified configuration, the Ruby SDK provides additional helpers for working with multiple regions.

There are two new helper classes for working with regions, AWS::Core::Region and AWS::Core::RegionCollection. The AWS module provides helper methods so that you should not need to instantiate these classes directly.

Region Objects

If you know the name of a region you need to work with, you can create it like so:

# no HTTP request required, simply returns a new AWS::Core::Region object
region = AWS.regions['us-west-2']

A region object provides access to service interface objects. Every service can be accessed using its short name.

region = AWS.regions['us-west-2']

# collect the ids of instances running in this region
region.ec2.instances.map(&:id)

# collect the name of tables created in this region
region.dynamo_db.tables.map(&:name)

See the Region class API documentation for a complete list of service interface helper methods.

RegionCollection

Besides returning a single region object, the region collection can also enumerate all public (non-GovCloud) regions.

AWS.regions.each do |region|
  puts region.name
end

Please note that when you enumerate regions, an HTTP request is made to get a current list of regions and services. The response is cached for the life of the process.

Enumerating Regions from a Service

Not all services are available in every region. You can safely enumerate only the regions a service operates in by using a region collection from a service interface. In the following example, we use the regions helper method to enumerate which regions Amazon DynamoDB and Amazon Redshift operate in.

AWS::DynamoDB.regions.map(&:name)
#=> ["us-east-1", "us-west-1", "us-west-2", "eu-west-1", "ap-northeast-1", "ap-southeast-1", "ap-southeast-2", "sa-east-1"] 

AWS::Redshift.regions.map(&:name)
#=> ["us-east-1", "us-west-2", "eu-west-1"] 

You can use the region object to operate on a service resource in each region it exists in. As a service expands to additional regions, your code will automatically include those regions when enumerating. In the following example, we list all of the Amazon DynamoDB tables, grouped by region.

# generate a list of DynamoDB tables for every region
AWS::DynamoDB.regions.each do |region|
  table_names = region.dynamo_db.tables.map(&:name)
  unless table_names.empty?
    puts "Region: " + region.name
    puts "Tables:"
    puts table_names.join("\n")
    puts ""
  end
end

Take the new regions interfaces for a spin and let us know what you think!

Working with Regions

by Trevor Rowe | in Ruby

The AWS SDK for Ruby (aws-sdk gem) has some cool new features that simplify working with regions.

The Ruby SDK defaults to the us-east-1 region for all services. Until recently, you had to specify the full regional endpoint for each service you connect to outside the default region. If you use multiple services outside us-east-1, this can be a pain. Your code might end up looking like this:

AWS.config(
  ec2_endpoint: 'ec2.us-west-2.amazonaws.com',
  s3_endpoint: 's3-us-west-2.amazonaws.com',
  # and so on ...
)

Region to the Rescue

You can now set the default region for all services with a single configuration option. Services will construct their own regional endpoint from the default region. If you want to do all of your work in us-west-2, the example above would now look like this:

AWS.config(region: 'us-west-2')

You can also pass the :region option directly to a service interface. This is helpful if you need to connect to multiple regional endpoints for a single service.

s3_east = AWS::S3.new(region: 'us-east-1')
s3_west = AWS::S3.new(region: 'us-west-2')

Deprecations

The service-specific endpoint options are all now deprecated. They will continue to be supported until removal in our next major revision of the Ruby SDK. The deprecated options are (replace svc with a service prefix like ec2, s3, etc.):

  • :svc_endpoint
  • :svc_port
  • :svc_region

Here are a few examples of how to upgrade from the deprecated configuration options to the new options:

# service prefixed connection options are deprecated with AWS.config
AWS.config(s3_endpoint: 'localhost', s3_port: 8000)
s3 = AWS::S3::Client.new

# service prefixed connection options are deprecated with clients
s3 = AWS::S3::Client.new(s3_endpoint: 'localhost', s3_port: 8000)

# this is the preferred method for setting endpoint and port
s3 = AWS::S3::Client.new(endpoint: 'localhost', port: 8000)
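
If you have existing option hashes that use the deprecated names, a small helper along these lines could translate them in one place. The helper unprefixed_options and the constant DEPRECATED_SUFFIXES are hypothetical and not part of the SDK; the sketch only strips the prefix from the three options listed above and leaves every other option untouched.

```ruby
# Hypothetical helper (not part of the SDK): renames :svc_endpoint,
# :svc_port, and :svc_region to :endpoint, :port, and :region.
DEPRECATED_SUFFIXES = %w[endpoint port region]

def unprefixed_options(options, service)
  prefix = /\A#{Regexp.escape(service)}_/
  options.each_with_object({}) do |(name, value), result|
    suffix = name.to_s.sub(prefix, '')
    if suffix != name.to_s && DEPRECATED_SUFFIXES.include?(suffix)
      result[suffix.to_sym] = value # deprecated option: drop the prefix
    else
      result[name] = value          # anything else passes through as-is
    end
  end
end

old_options = { s3_endpoint: 'localhost', s3_port: 8000 }
new_options = unprefixed_options(old_options, 's3')
p new_options
```

The resulting hash has the same shape as the preferred form shown above, so it could be passed to AWS::S3::Client.new.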

Threading with the AWS SDK for Ruby

by Loren Segal | in Ruby

When using threads in an application, it’s important to keep thread-safety in mind. This statement is not specific to the Ruby world; it’s a reality in any language that supports threading. What is specific to Ruby is the fact that many libraries in our language are loaded at run-time, and often, loading code at run-time is not a thread-safe operation.

Autoload and Thread-Safety

Many libraries and frameworks (including Ruby on Rails) use a feature of Ruby known as autoload, which allows components of a library to be lazily loaded only when the constant is resolved in the code path of an executing program. The problem with this feature is that, historically, the implementation has not been thread-safe. In other words, if two threads tried to resolve an autoloaded constant at the same time, weird things would happen. This problem was finally tackled in Ruby 1.9.1 but then regressed in 1.9.2 and re-resolved in 1.9.3 (but only in a later patchlevel), causing a bit of confusion around whether autoload is actually safe to use in a threaded Ruby program.

Thread-Safe in 2.0

For all intents and purposes, autoloading of modules should be considered thread-safe in Ruby 2.0.0p0, as the patch was officially merged into the 2.0 branch prior to release. Any thread-safety issues in Ruby 2.0 should be considered regressions, according to that ticket.

Enter Eager Loading

Of course, guaranteeing support for Ruby 2.0 is not entirely sufficient for most programs still running on 1.9.x, and in some cases, 1.8.x, so you may need to use a more backward-compatible strategy. In Ruby on Rails, this was solved with an eager_autoload method that forcibly loads all modules marked to be lazily loaded. If you are running threaded code, it is recommended that you call this prior to launching threads. Note that in Rails 4.0, the framework will eager load all modules by default, which should help you avoid having to think about these threading issues.

Eager Autoloading in AWS SDK for Ruby

So is this an issue for the AWS SDK for Ruby? In short, if you are using a version prior to Ruby 2.0, the answer is "most likely". The SDK is large enough that lazily loading extra modules is important to keep library load time as fast as possible. The downside of this approach is that it can cause issues in multi-threaded programs.

To solve the problem in the SDK, we use a similar mechanism to Ruby on Rails and created an AWS.eager_autoload! method that requires all modules in the library up front. To use this method, simply call it before you launch any threads:

require 'aws-sdk'

AWS.eager_autoload!

# Now you can start threading
Thread.new do
  # ... your threaded SDK calls go here ...
end
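
To make the idea of eager loading concrete without the SDK, here is a rough sketch built on Ruby's own Module#autoload and Module#autoload?. The eager_load_constants! helper and the Demo module are hypothetical and only illustrate the general technique of resolving lazily loaded constants on the main thread before any threads start; the SDK's AWS.eager_autoload! may work differently internally.

```ruby
require 'tmpdir'
require 'fileutils'

# Resolve every constant that mod has registered with autoload
# but has not loaded yet.
def eager_load_constants!(mod)
  mod.constants.each do |name|
    mod.const_get(name) if mod.autoload?(name)
  end
end

# Demo setup: a throwaway module with one lazily loaded constant.
dir = Dir.mktmpdir
path = File.join(dir, 'lazy_part.rb')
File.write(path, "module Demo; class LazyPart; end; end\n")

module Demo; end
Demo.autoload(:LazyPart, path)

# Load everything on the main thread first, then start threading.
eager_load_constants!(Demo)
threads = 4.times.map { Thread.new { Demo::LazyPart.new } }
threads.each(&:join)

FileUtils.remove_entry(dir)
```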

Focused Eager Loading

Sometimes, loading all of the SDK is unnecessary and slow. Fortunately, as of version 1.9.0 of the Ruby SDK, the AWS.eager_autoload! method now optionally accepts the name of a module to load instead of requiring you to eager load the entire SDK. This means that if you are only using a specific service, or a set of services, like Amazon S3 and Amazon DynamoDB, you can choose to eager load only these modules. This can help to improve load time of your application, especially if you do not need many of the other modules packaged in the SDK. To load a focused set of modules, simply call the eager autoload method with the names of the modules you want to load along with AWS::Core:

AWS.eager_autoload! AWS::Core     # Make sure to load Core first.
AWS.eager_autoload! AWS::S3       # Load the S3 class
AWS.eager_autoload! AWS::DynamoDB # Load the DynamoDB class

# Now you can start threading
Thread.new do
  # ... your threaded SDK calls go here ...
end

Wrapping Up This Thread

The AWS SDK for Ruby has an AWS.eager_autoload! method that allows you to forcibly load all components in the library up front. If you are writing multi-threaded code in Ruby, you will most likely want to call this method before launching any threads that make use of the SDK in order to avoid any thread-safety issues with autoload in older versions of Ruby. Fortunately, it is very easy to use by adding a single method call to the top of your application. It is also easy to target specific modules to eager load by passing the module name to the method, if load-time performance is important to your library or application.

AWS at RailsConf 2013

by Trevor Rowe | in Ruby

Loren and I will be at RailsConf next week. AWS will have a booth on the exhibitor floor. If you have any questions about the AWS SDK for Ruby (or anything really), we’d love to chat. We will have swag and credits to hand out, so come stop by and say hi.

I will also be giving a talk Wednesday morning about using a model to describe your web service. The technical parts of the talk are extracted from some of the cool work we are doing at AWS. If you don’t have a chance to come by the booth, you can also catch us after the talk.

See you in Portland!