

From RSpec to Minitest

by Trevor Rowe

One of my favorite aspects of working with Ruby is how natural it is to write tests for it. The Ruby community does an excellent job of encouraging authors to produce well-tested code. There is a plethora of well-supported tools to choose from. I like to joke that new Ruby developers write micro test frameworks instead of "Hello World!".

Much of the Ruby code I have maintained uses RSpec. Lately I have been spending some time with Minitest. You may have worked with Minitest — it ships as part of the Ruby standard library.

Why bother learning another testing framework when your current tool suits your needs? My answer: why not? It’s always good to expand your horizons. I have found that learning a new testing tool expands my ability to write good tests. I pick up new patterns, and the context switch forces me to question my standard testing approaches. As a result, I tend to write better tests.

Minitest::Spec

Minitest::Spec does a great job of bridging the gap between RSpec-style specifications and Minitest-style unit tests.

Here is an example RSpec test file:

require 'spec_helper'

describe MyClass do
  describe '#some_method' do
    it 'returns a string' do
      MyClass.new.some_method.should be_a(String)
    end
  end
end

And the same thing using Minitest::Spec:

require 'test_helper'

describe MyClass do
  describe '#some_method' do
    it 'returns a string' do
      MyClass.new.some_method.must_be_kind_of(String)
    end
  end
end

Matchers

The primary difference above is how you make assertions. RSpec-style should matchers can be converted to Minitest expectations with ease. The table below gives a few examples.

RSpec Matcher                                  Minitest Expectation
obj.should be(value)                           obj.must_be_same_as(value)
obj.should be_empty                            obj.must_be_empty
obj.should be(nil)                             obj.must_be_nil
obj.should eq(value)                           obj.must_equal(value)
lambda { … }.should raise_error(ErrorClass)    lambda { … }.must_raise(ErrorClass)

See the Minitest API documentation for more expectations.

Mocks

Mocks (and stubs) are where the two testing libraries differ the most. RSpec provides doubles; Minitest provides a Mock class.

@mock = Minitest::Mock.new

You can set expectations about what messages the mock should receive. You name the method to expect, the value the mock should return, and optionally the arguments the method should receive. If I need a mock for a user and expect the user’s delete method to be called, I could do the following:

user = Minitest::Mock.new
user.expect(:delete, true) # returns true, expects no args

UserDestroyer.new.delete_user(user)

assert user.verify

Calling #verify is necessary for the mock to enforce its expectations. RSpec makes this a little easier, but it’s not a huge adjustment.

Stubs

Stubs are pretty straightforward. Unlike RSpec, a stub lasts only until the end of the block. You also cannot stub methods that don’t exist yet.

Time.stub :now, Time.at(0) do
  assert obj_under_test.stale?
end
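A self-contained sketch of the block-scoping behavior described above (requiring minitest/mock is what defines Object#stub):

```ruby
require 'minitest/mock' # defines Object#stub

# The stub lasts only for the duration of the block.
Time.stub :now, Time.at(0) do
  puts Time.now.utc        # 1970-01-01 00:00:00 UTC (stubbed)
end
puts Time.now > Time.at(0) # true - the real Time.now is restored
```

There is no need to undo the stub yourself; Minitest restores the original method when the block exits, even if the block raises.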

Summary

Minitest works well, and I’m impressed by how fast it runs tests. Coming from RSpec, you may find yourself missing features. Instead of trying to find exact replacements, consider using plain old Ruby solutions instead. You may find you write better tests. I still enjoy working with RSpec and appreciate its strengths, but you should also consider giving Minitest a spin.

Efficient Amazon S3 Object Concatenation Using the AWS SDK for Ruby

by Trevor Rowe

Today’s post is from one of our Solutions Architects: Jonathan Desrocher, who coincidentally is also a huge fan of the AWS SDK for Ruby.


There are certain situations where we would like to take a dataset that is spread across numerous Amazon Simple Storage Service (Amazon S3) objects and represent it as a new object that is the concatenation of those S3 objects. A real-life example might be combining individual hourly log files from different servers into a single environment-wide concatenation for easier indexing and archival. Another use case would be concatenating outputs from multiple Elastic MapReduce reducers into a single task summary.

While it is possible to download and re-upload the data to S3 through an EC2 instance, a more efficient approach would be to instruct S3 to make an internal copy using the new copy_part API operation that was introduced into the SDK for Ruby in version 1.10.0.

Why upload when you can copy?

Typically, new S3 objects are created by uploading data from a client using the AWS::S3::S3Object#write method, or by copying the contents of an existing S3 object using the AWS::S3::S3Object#copy_to method of the Ruby SDK.

While the copy operation offers the advantage of offloading data transfer from the client to the S3 back end, it can only produce new objects containing exactly the same data as the original. Because S3 objects are immutable, this limits the copy operation’s usefulness to occasions where we want to preserve the data but change the object’s properties (such as its key name or storage class).

In our case, we want to offload the heavy lifting of the data transfer to S3’s copy functionality, but at the same time, we need to be able to shuffle different source objects’ contents into a single target derivative—and that brings us to the Multipart Upload functionality.

Copying into a Multipart Upload

Amazon S3 offers a Multipart Upload feature that enables customers to create a new object in parts and then combine those parts into a single, coherent object.

In its own right, Multipart Upload enables us to efficiently upload large amounts of data and to deal with an unreliable network connection (often the case with mobile devices), as the individual upload parts can be retried individually (thus reducing the volume of data retransmissions). Just as importantly, the individual upload parts can be uploaded in parallel, which can greatly increase the aggregate throughput of the upload (note that the same benefits also apply when using byte-range GETs).

Multipart Upload can be combined with the copy functionality through the Ruby SDK’s AWS::S3::MultipartUpload#copy_part method—which results in the internal copy of the specified source object into an upload part of the Multipart Upload.

Upon completion of the Multipart Upload job, the different upload parts are combined so that the last byte of an upload part is immediately followed by the first byte of the subsequent part (which could itself be the target of a copy operation), resulting in a true in-order concatenation of the specified source objects.

Code Sample

Note that this example uses Amazon EC2 roles for authenticating to S3. For more information about this feature, see our “credential management” post series.


require 'rubygems'
require 'aws-sdk'

s3 = AWS::S3.new
mybucket = s3.buckets['my-multipart']

# First, let's start the Multipart Upload
obj_aggregate = mybucket.objects['aggregate'].multipart_upload

# Then we will copy into the Multipart Upload all of the objects in a certain S3 directory.
mybucket.objects.with_prefix('parts/').each do |source_object|

  # Skip the directory object
  unless source_object.key == 'parts/'
    # Note that this section is thread-safe and could greatly benefit from parallel execution.
    obj_aggregate.copy_part(source_object.bucket.name + '/' + source_object.key)
  end

end

obj_completed = obj_aggregate.complete()

# Generate a signed URL to enable a trusted browser to access the new object without authenticating.
puts obj_completed.url_for(:read)

Last Notes

  • The AWS::S3::MultipartUpload#copy_part method has an optional parameter called :part_number. Omitting this parameter (as in the example above) is thread-safe. However, if multiple processes are participating in the same Multipart Upload (as in different Ruby interpreters on the same machine or different machines altogether), then the part number must be explicitly provided in order to avoid sequence collisions.
  • With the exception of the last part, there is a 5 MB minimum part size.
  • The completed Multipart Upload object is limited to a 5 TB maximum size.
  • It is possible to mix-and-match between upload parts that are copies of existing S3 objects and upload parts that are actually uploaded from the client.
  • For more information on S3 multipart upload and other cool S3 features, see the “STG303 Building scalable applications on S3” session from AWS re:Invent 2012.

Happy concatenating!

AWS re:Invent 2013

by Trevor Rowe

AWS re:Invent is this week (November 12th-15th) in Las Vegas! We are excited to be here now, and to have an opportunity to talk to you in person.

There is going to be a lot of great technical content this year. Loren Segal and I will be presenting a session on Thursday called Diving Into the New AWS SDK for Ruby. We will also be hanging out in the developer lounge area, so come by any time, especially during our Ruby development office hours.

See you there!

AWS SDK for Ruby Core Developer Preview

by Trevor Rowe

A few months ago, Loren blogged about the upcoming version 2 of the AWS SDK for Ruby. Shortly after that, we published our work-in-progress code to GitHub as aws/aws-sdk-core-ruby. I am happy to announce that AWS SDK Core has stabilized enough to enter a developer preview period. It currently supports 30+ services.

To install AWS SDK Core from Rubygems:

gem install aws-sdk-core --pre

Or with Bundler:

gem 'aws-sdk-core'

What is AWS SDK Core?

Version 2 of the Ruby SDK will separate the higher level service abstractions from the service clients. We are focusing our initial efforts to ensure the new client libraries are robust, extensible, and more capable than those in version 1. We are still exploring how best to migrate higher level abstractions from version 1 into version 2.

AWS SDK Core uses a different namespace. This allows you to install both aws-sdk and aws-sdk-core in the same application.
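Because the namespaces don’t collide, both gems can sit in one Gemfile (a sketch; pin versions as appropriate for your application):

```ruby
# Gemfile
gem 'aws-sdk'       # version 1, clients under the AWS:: namespace
gem 'aws-sdk-core'  # version 2, clients under the Aws:: namespace

# In application code the two coexist without clashing:
#   AWS::S3.new  # v1 client
#   Aws::S3.new  # v2 client
```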

Versioning Strategy

AWS SDK Core is being released as version ‘2.0.0-rc.1’. This shows our intention that core will be the successor to the current Ruby SDK.


AWS SDK for Ruby and Nokogiri

by Trevor Rowe

In two weeks, on November 19, 2013, we will be removing the upper bound from the nokogiri gem dependency in version 1 of the AWS SDK for Ruby. We’ve been discussing this change with users for a while on GitHub.

Why Is There Currently an Upper Bound?

Nokogiri removed support for Ruby 1.8 with the release of version 1.6. The Ruby SDK requires Nokogiri and supports Ruby 1.8.7+. The current restriction ensures that Ruby 1.8 users can install the aws-sdk gem.

Why Remove the Upper Bound?

Users of the Ruby SDK on Ruby 1.9+ have been asking us to remove the upper bound. Some want access to features of Nokogiri 1.6+, while others have dependency-management headaches when multiple libraries require Nokogiri with mutually exclusive version constraints.

The Ruby SDK has been tested against Nokogiri 1.4+. By removing this restriction, we allow end users to choose the version of Nokogiri that best suits their needs. Libraries that have narrow dependencies can make managing co-dependencies difficult. We want to help remove some of that headache.

Will it Still be Possible to Install the Ruby SDK in Ruby 1.8.7?

Yes, it will still be possible to install the Ruby SDK in Ruby 1.8.7 when the restriction is removed. If your target environment already has Nokogiri < 1.6 installed, then you don’t need to do anything. Otherwise, you will need to install Nokogiri before installing the aws-sdk gem.

If you are using bundler to install the aws-sdk gem, add an entry for Nokogiri:

gem 'aws-sdk'
gem 'nokogiri', '< 1.6'

If you are using the gem command to install the Ruby SDK on Ruby 1.8.7, install a compatible version of Nokogiri first.

gem install nokogiri --version="<1.6"
gem install aws-sdk

You should not need to make any changes if any of the following are true:

  • You have a Gemfile.lock deployed with your application
  • Nokogiri < 1.6 is already installed in your target environment
  • Your deployed environment does not reinstall gems when restarted

What About Version 2 of the Ruby SDK?

Version 2 of the Ruby SDK no longer relies directly on Nokogiri. Instead it has dependencies on multi_xml and multi_json. This allows you to use Ox and Oj for speed, or Nokogiri if you prefer. Additionally, the Ruby SDK will work with pure Ruby XML and JSON parsing libraries which makes distributing to varied environments much simpler.
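For example, an application that wants the faster C-based parsers could opt in through its Gemfile (a sketch; multi_xml and multi_json pick up these gems automatically when present):

```ruby
# Gemfile
gem 'aws-sdk-core'
gem 'ox' # multi_xml prefers Ox for XML parsing when available
gem 'oj' # multi_json prefers Oj for JSON parsing when available
```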

Keep the Feedback Coming

We appreciate all of the users who have chimed in on this issue and shared their feedback. User feedback definitely helps guide development of the Ruby SDK. If you haven’t had a chance to check out our work on v2 of the Ruby SDK, please take a look at the GitHub repo for AWS Ruby Core.

Thanks!

AWS SDK Core Response Structures

by Trevor Rowe

I blogged recently about how the code is now available for AWS SDK Core. This new repository is the basis for what will become version 2 of the AWS SDK for Ruby.

We have not cut a public gem for AWS SDK Core yet. Instead, we have published the work-in-progress code to GitHub for the community to see. We hope to get a lot of feedback on features as they are developed. In an effort to engage the community, I will be blogging about some of these new features and soliciting feedback. Our hope is to improve the overall quality of version 2 of the Ruby SDK through this process.

Today, I will be talking about the new response structures.

V1 Response Structures

In version 1 of the Ruby SDK, the low-level clients accepted a hash of request parameters, and then returned response data as a hash. Here is a quick example:

# hash in, hash out
response = AWS::S3::Client.new.list_buckets(limit: 2)
pp response.data

{:buckets=>[
  {:name=>"aws-sdk", :creation_date=>"2012-03-19T16:37:04.000Z"},
  {:name=>"aws-sdk-2", :creation_date=>"2013-09-27T16:17:02.000Z"}],
 :owner=>
  {:id=>"...",
   :display_name=>"..."}}

This approach is simple and flexible. However, it gives little guidance when exploring a response. Here are some issues that arise from using hashes:

  • Attempts to access unset response keys return a nil value. There is no way to tell if the service omitted the value or if the hash key contains a typo.

  • Operating on nested values is a bit awkward. To collect bucket names, a user would need to use blocks to access attributes:

    data[:buckets].map{ |b| b[:name] }
  • The response structure gives no information about what other attributes the described resource might have, only those currently present.

V2 Response Structures

In AWS SDK Core, we take a different approach. We use descriptions of the complete response structure to construct Ruby Struct objects. Here is the sample from above, using version 2:

Aws::S3.new.list_buckets.data
#=> #<struct 
 buckets=
  [#<struct name="aws-sdk", creation_date=2012-03-19 16:37:04 UTC>,
   #<struct name="aws-sdk-2", creation_date=2013-09-27 16:17:02 UTC>],
 owner=
  #<struct 
   id="...",
   display_name="...">>

Struct objects provide the following benefits:

  • Indifferent access with strings, symbols, and methods:

    data.buckets.first.name
    data[:buckets].first[:name]
    data['buckets'].first['name']
    
  • Operating on nested values is possible using Symbol-to-Proc semantics:

    data.buckets.map(&:name)
    
  • Accessing an invalid property raises an error:

    data.buckets.first.color
    #=> raises NoMethodError: undefined method `color' for #<struct ...>
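The same benefits can be seen with plain Ruby — this is not SDK code, just a sketch using Ruby’s own Struct class to show why struct-based response data beats raw hashes:

```ruby
# Plain-Ruby sketch of the Struct advantages listed above.
Bucket = Struct.new(:name, :creation_date)

buckets = [
  Bucket.new('aws-sdk',   Time.utc(2012, 3, 19)),
  Bucket.new('aws-sdk-2', Time.utc(2013, 9, 27))
]

puts buckets.map(&:name).inspect # ["aws-sdk", "aws-sdk-2"]
puts buckets.first[:name]        # aws-sdk (symbol and string access work too)

begin
  buckets.first.color            # a typo fails loudly...
rescue NoMethodError => e
  puts e.class                   # ...instead of silently returning nil
end
```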
    

Feedback

What do you think about the new response structures? Take a moment to check out the new code and give it a spin. We would love to hear your feedback. Issues and feature requests are welcome. Come join us on GitHub.

Introducing AWS SDK Core

by Trevor Rowe

We’ve been working hard on version 2 of the AWS SDK for Ruby. Loren blogged about some of our upcoming plans for version 2. I’m excited to pull back the curtains and show off the work we’ve done on version 2 of the Ruby SDK.

AWS SDK Core

The AWS SDK Core library will provide a single client for each of the Amazon Web Services we support. Our initial goal is to reach feature parity between AWS SDK Core clients and Ruby SDK version 1 clients. We have made good progress, but some features are still missing, e.g., retry logic, logging, and API reference docs.

We are currently evaluating how to provide higher level abstractions like those found in version 1. This is not our current focus.

You can learn more about AWS SDK Core from the project README.

A New Namespace

If you dig around the AWS SDK Core source code on GitHub, you may notice we have changed namespaces from AWS:: to Aws::. We want to make it possible for users to install versions 1 and 2 of the Ruby SDK in the same project. This will make it much easier to try out the new code and to upgrade at will.

What’s Next?

After client parity, there are a lot of things on our todo list. We are releasing the code now so that we can solicit your feedback. Your feedback helps us pick our priorities. Check it out and drop us a note!

DynamoDB Session Store for Rack Applications

by Loren Segal

Today we are announcing a new RubyGem that enables your Ruby on Rails or Rack-based applications to store session data inside of Amazon DynamoDB. The gem acts as a drop-in replacement for session stores inside of Rails and can also run as a Rack middleware for non-Rails apps. You can read more about how to install and configure the gem on the GitHub repository: aws/aws-sessionstore-dynamodb-ruby. If you want to get started right away, just add the gem to your Gemfile via:

gem 'aws-sessionstore-dynamodb', '~> 1.0'

For me, the best part of this gem is that it was the product of a summer internship project by one of our interns, Ruby Robinson. She did a great job ramping up on new skills and technologies, and ultimately managed to produce some super well-tested and idiomatic code in a very short period of time. Here’s Ruby in her own words:

Hello, my name is Ruby Robinson, and I was a summer intern with the AWS Ruby SDK team. My project was to create a RubyGem (aws-sessionstore-dynamodb) that allowed Rack applications to store sessions in Amazon DynamoDB.

I came into the internship knowing Java and, ironically, not knowing Ruby. It was an opportunity to learn something new and contribute to the community. After poring over a series of books, tutorials, and blogs on Ruby, Ruby on Rails, and Rack, the gem emerged, with help from Loren and Trevor.

Along with creating the gem, I got to experience the Amazon engineering culture. It largely involves taking ownership of projects, innovation, and scalability. I got to meet with engineers who were solving problems at scales I had only heard of. With an Amazon internship, you are not told what to do; you are asked what you are going to do. As my technical knowledge grew, I was able to take ownership of my project and drive it to completion.

In the end I produced a gem with some cool features! The gem is a drop-in replacement for the default session store that gives you the persistence and scale of Amazon DynamoDB. So, what are you waiting for? Check out the gem today!

The experience of bringing a developer from another language into Ruby taught me quite a bit about all of the great things that our ecosystem provides us, and also shined a light on some of the things that are more confusing to newbies. In the end, it was extremely rewarding to watch someone become productive in the language in such a short period of time. I would recommend that everyone take the opportunity to teach new Rubyists the language, if that opportunity ever arises. I think it’s also important that we encourage new developers to become active in the community and write more open source code, since that’s what makes our ecosystem so strong. So, if you know of a new Rubyist in your area, invite them out to your local Ruby meetup or hackfest and encourage them to contribute to some of the projects. You never know, in a few years these might be the people writing and maintaining the library code you depend on every day.

And with that said, please check out our new Amazon DynamoDB Session Store for Rack applications and let us know what you think, either here, or on GitHub!

AWS SDK for Ruby v1.15.0

by Alex Wood

Yesterday morning, we released a new version of the AWS SDK for Ruby (aws-sdk gem). This release adds mobile push support for Amazon Simple Notification Service. The release also includes API updates for Amazon Redshift, adding snapshot identifiers to the AWS::Redshift::Client#copy_cluster_snapshot and AWS::Redshift::Client#delete_cluster_snapshot operations, and enabling better status reporting for restoring from snapshots.

You can view the release notes here.

AWS SDK for Ruby v1.14.0

by Trevor Rowe

We just published v1.14.0 of the AWS SDK for Ruby (aws-sdk gem). This release updates the SDK to support custom Amazon Machine Images (AMIs) and Chef 11 for AWS OpsWorks. It also updates Amazon Simple Workflow Service and Amazon Simple Notification Service to the latest API versions.

You can view the release notes here.