Using Client-Side Encryption for S3 in the AWS SDK for Ruby

by Alex Wood | in Ruby

What is client-side encryption, and why might I want to use it?

If you wish to store sensitive data in Amazon S3 with the AWS SDK for Ruby, you have several ways of managing the safety and security of the data. One good practice is to use HTTPS whenever possible to protect your data in transit. Another is to use S3’s built-in server-side encryption to protect your data at rest. In this post, we highlight yet another option, client-side encryption.

Client-side encryption is a little more involved than server-side encryption, since you bring and manage your own encryption key. But it has the added benefit that your data never exists in an unencrypted state outside of your execution environment.

How do I use client-side encryption with the AWS SDK for Ruby?

The SDK for Ruby does most of the heavy lifting for you when using client-side encryption for your S3 objects. When performing read and write operations on S3, you can specify various options in an options hash passed to the S3Object#write and S3Object#read methods.

One of these options is :encryption_key, which accepts either an RSA key (for asymmetric encryption) or a string (for symmetric encryption). The SDK for Ruby then uses your key to encrypt an auto-generated AES key, which is used to encrypt and decrypt the payload of your object. The encrypted form of your auto-generated key is stored with the headers of your object in S3.

Here is a short example you can try to experiment with client-side encryption yourself:

require 'aws-sdk'
require 'openssl'

# Set your own bucket/key/data values
bucket = 's3-bucket'
key = 's3-object-key'
data = 'secret message'

# Creates a string key - store this!
symmetric_key = OpenSSL::Cipher::AES256.new(:CBC).random_key

options = { :encryption_key => symmetric_key }
s3_object = AWS.s3.buckets[bucket].objects[key]

# Writing an encrypted object to S3
s3_object.write(data, options)

# Reading the object from S3 and decrypting
puts s3_object.read(options)

There are a couple of practical matters you should consider. One is that if you lose the key used to encrypt the object, you will be unable to decrypt your contents. You should store your key securely (e.g., as a file or using a separate key management system) and load it when needed for writing or reading objects. Additionally, encrypting and decrypting your objects does bring some performance overhead, so you should use it only when needed (the overhead varies depending on the size and type of key used).

You can read more about the encryption choices available to you with the AWS SDK for Ruby in our API documentation. You can also read more about general best practices for security in AWS by following the AWS Security Blog. As you consider the choices available for securing your data, we hope you find them effective and simple to use.

Amazon S3 PHP Stream Wrapper

by Michael Dowling | in PHP

As of the 2.3.0 release, the AWS SDK for PHP now provides an official Amazon S3 PHP stream wrapper. The stream wrapper allows you to treat Amazon S3 like a filesystem using functions like fopen(), file_get_contents(), and filesize() through a custom stream wrapper protocol. The Amazon S3 stream wrapper opens up some interesting possibilities that were either previously impossible or difficult to implement.

Registering the stream wrapper

Before you can use the Amazon S3 stream wrapper, you must register it with PHP:

use Aws\S3\S3Client;

// Create an Amazon S3 client object
$client = S3Client::factory(array(
    'key'    => '[aws access key]',
    'secret' => '[aws secret key]'
));

// Register the stream wrapper from a client object
$client->registerStreamWrapper();

After registering the stream wrapper, you can use various PHP filesystem functions that support custom stream wrapper protocols.

$bucket = 'my_bucket';
$key = 'object_key';

// Get the contents of an object as a string
$contents = file_get_contents("s3://{$bucket}/{$key}");

// Get the size of an object
$size = filesize("s3://{$bucket}/{$key}");

Stream wrappers in PHP are identified by a unique protocol; the Amazon S3 stream wrapper uses the "s3://" protocol. Amazon S3 stream wrapper URIs always start with the "s3://" protocol followed by an optional bucket name, forward slash, and optional object key: s3://bucket/key.

Streaming downloads

The Amazon S3 stream wrapper allows you to truly stream downloads from Amazon S3 using functions like fopen(), fread(), and fclose(). This allows you to read bytes off of a stream as needed rather than downloading an entire stream upfront and then working with the data.

The following example opens a read-only stream, reads up to 1024 bytes at a time from the stream, and closes the stream when no more data can be read from it.

// Open a stream in read-only mode
if (!($stream = fopen("s3://{$bucket}/{$key}", 'r'))) {
    die('Could not open stream for reading');
}

// Check if the stream has more data to read
while (!feof($stream)) {
    // Read up to 1024 bytes from the stream
    echo fread($stream, 1024);
}
// Be sure to close the stream resource when you're done with it
fclose($stream);

Seekable streams

Because no data is buffered in memory, read-only streams with the Amazon S3 stream wrapper are by default not seekable. You can force the stream to allow seeking using the seekable stream context option.

// Create a stream context to allow seeking
$context = stream_context_create(array(
    's3' => array(
        'seekable' => true
    )
));

if ($stream = fopen('s3://bucket/key', 'r', false, $context)) {
    // Read bytes from the stream
    fread($stream, 1024);
    // Seek back to the beginning of the stream
    fseek($stream, 0);
    // Read the same bytes that were previously read
    fread($stream, 1024);
    fclose($stream);
}

Opening seekable streams allows you to seek only to bytes that have been previously read. You cannot skip ahead to bytes that have not yet been read from the remote server. In order to allow previously read data to be recalled, data is buffered in a PHP temp stream using Guzzle’s CachingEntityBody decorator.

Streaming uploads from downloads

You can use an Amazon S3 stream resource with other AWS SDK for PHP operations. For example, you could stream the contents of one Amazon S3 object to a new Amazon S3 object.

$stream = fopen("s3://{$bucket}/{$key}", 'r');

if (!$stream) {
    die('Unable to open stream for reading');
}

$client->putObject(array(
    'Bucket' => 'other_bucket',
    'Key'    => $key,
    'Body'   => $stream
));

fclose($stream);

Uploading data

In addition to downloading data with the stream wrapper, you can use the stream wrapper to upload data as well.

$stream = fopen("s3://{$bucket}/{$key}", 'w');
fwrite($stream, 'Hello!');
fclose($stream);

Note: Because Amazon S3 requires a Content-Length for all entity-enclosing HTTP requests, the contents of an upload must be buffered using a PHP temp stream before it is sent over the wire.
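
For larger local files, the wrapper also plays nicely with PHP’s stream functions. Here is a minimal sketch of copying a local file into an object through the wrapper (the bucket, key, and file path are placeholders):

// A minimal sketch of uploading a local file through the stream wrapper
// (the bucket, key, and local path are placeholders)
$source = fopen('/path/to/local/file.ext', 'r');
$target = fopen("s3://{$bucket}/{$key}", 'w');

// Copy the local file into the S3 write stream; the wrapper buffers the
// data in a temp stream and sends it to Amazon S3 when the stream is closed
stream_copy_to_stream($source, $target);

fclose($source);
fclose($target);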

Traversing buckets

You can modify and browse Amazon S3 buckets much like PHP allows you to modify and traverse directories on your filesystem.

Here’s an example of creating a bucket:

mkdir('s3://bucket');

You can delete empty buckets using the rmdir() function.

rmdir('s3://bucket');

The opendir(), readdir(), rewinddir(), and closedir() PHP functions can be used with the Amazon S3 stream wrapper to traverse the contents of a bucket.

$dir = "s3://bucket/";

if (is_dir($dir) && ($dh = opendir($dir))) {
    while (($file = readdir($dh)) !== false) {
        echo "filename: {$file} : filetype: " . filetype($dir . $file) . "n";
    }
    closedir($dh);
}

You can recursively list each object and prefix in a bucket using PHP’s RecursiveDirectoryIterator.

$dir = 's3://bucket';
$iterator = new RecursiveIteratorIterator(new RecursiveDirectoryIterator($dir));

foreach ($iterator as $file) {
    echo $file->getType() . ': ' . $file . "\n";
}

Using the Symfony2 Finder component

The easiest way to traverse an Amazon S3 bucket using the Amazon S3 stream wrapper is through the Symfony2 Finder component. The Finder component allows you to more easily filter the files that the stream wrapper returns.

require 'vendor/autoload.php';

use Symfony\Component\Finder\Finder;

$finder = new Finder();

// Get all files and folders (key prefixes) from "bucket" that are less than
// 100K and have been updated in the last year
$finder->in('s3://bucket')
    ->size('< 100K')
    ->date('since 1 year ago');

foreach ($finder as $file) {
    echo $file->getType() . ": {$file}\n";
}

You will need to install the Symfony2 Finder component and add it to your project’s autoloader in order to use it with the AWS SDK for PHP. The most common way to do this is to add the Finder component to your project’s composer.json file. You can find out more about this process in the Composer documentation.
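
For example, a composer.json along the lines of the following would pull in the Finder component alongside the SDK (the version constraints shown here are only illustrative):

{
    "require": {
        "aws/aws-sdk-php": "~2.3",
        "symfony/finder": "~2.3"
    }
}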

More information

We hope you find the new Amazon S3 stream wrapper useful. You can find more information and documentation on the Amazon S3 stream wrapper in the AWS SDK for PHP User Guide.

Syncing Data with Amazon S3

by Michael Dowling | in PHP

Warning: This blog post provides instructions for AWS SDK for PHP V2. If you are looking for AWS SDK for PHP V3 instructions, please see our SDK guide.

Have you ever needed to upload an entire directory of files to Amazon S3 or download an Amazon S3 bucket to a local directory? With a recent release of the AWS SDK for PHP, this is now not only possible, but really simple.

Uploading a directory to a bucket

First, let’s create a client object that we will use in each example.

use Aws\S3\S3Client;

$client = S3Client::factory(array(
    'key'    => 'your-aws-access-key-id',
    'secret' => 'your-aws-secret-access-key'
));

After creating a client, you can upload a local directory to an Amazon S3 bucket using the uploadDirectory() method of a client:

$client->uploadDirectory('/local/directory', 'my-bucket');

This small bit of code compares the contents of the local directory to the contents in the Amazon S3 bucket and only transfers files that have changed. While iterating over the keys in the bucket and comparing them against the names of local files, the method uploads the changed files in parallel using batches of requests. When the size of a file exceeds a customizable multipart_upload_size option, the uploader automatically uploads the file using a multipart upload.
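
For example, you could adjust that threshold through the options array. Here is a minimal sketch (the 16 MB value is only an illustration):

// Use multipart uploads for any file larger than 16 MB
// (the threshold value here is only an example)
$client->uploadDirectory('/local/directory', 'my-bucket', null, array(
    'multipart_upload_size' => 16 * 1024 * 1024
));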

Customizing the upload sync

Plenty of options and customizations exist to make the uploadDirectory() method flexible so that it can fit many different use cases and requirements.

The following example uploads a local directory where each object is stored in the bucket using a public-read ACL, 20 requests are sent in parallel, and debug information is printed to standard output as each request is transferred.

$dir = '/local/directory';
$bucket = 'my-bucket';
$keyPrefix = '';
$options = array(
    'params'      => array('ACL' => 'public-read'),
    'concurrency' => 20,
    'debug'       => true
);

$client->uploadDirectory($dir, $bucket, $keyPrefix, $options);

By specifying $keyPrefix, you can cause the uploaded objects to be placed under a virtual folder in the Amazon S3 bucket. For example, if the $bucket name is “my-bucket” and the $keyPrefix is “testing/”, then your files will be uploaded to “my-bucket” under the “testing/” virtual folder: https://my-bucket.s3.amazonaws.com/testing/filename.txt.
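
Continuing the example above, an upload into that “testing/” virtual folder would look like this:

// Upload the local directory under the "testing/" virtual folder of the bucket
$client->uploadDirectory($dir, $bucket, 'testing/', $options);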

You can find more documentation about uploading a directory to a bucket in the AWS SDK for PHP User Guide.

Downloading a bucket

Downloading an Amazon S3 bucket to a local directory is just as easy. We’ll again use a simple function available on an Aws\S3\S3Client object to easily download objects: downloadBucket().

The following example downloads all of the objects from my-bucket and stores them in /local/directory. Object keys that are under virtual subfolders are converted into a nested directory structure when the objects are downloaded.

$client->downloadBucket('/local/directory', 'my-bucket');

Customizing the download sync

Similar to the uploadDirectory() method, the downloadBucket() method has several options that can customize how files are downloaded.

The following example downloads a bucket to a local directory by downloading 20 objects in parallel and prints debug information to standard output as each transfer takes place.

$dir = '/local/directory';
$bucket = 'my-bucket';
$keyPrefix = '';

$client->downloadBucket($dir, $bucket, $keyPrefix, array(
    'concurrency' => 20,
    'debug'       => true
));

By specifying $keyPrefix, you can limit the downloaded objects to only keys that begin with the specified $keyPrefix. This can be useful for downloading objects under a virtual directory.

The downloadBucket() method also accepts an optional associative array of $options that can be used to further control the transfer. One option of note is the allow_resumable option, which allows the transfer to resume any previously interrupted downloads. This can be useful for resuming the download of a very large object so that you only need to download any remaining bytes.
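
Putting these two together, here is a sketch that downloads only the keys beginning with a “videos/” prefix and resumes any previously interrupted transfers (the prefix and local directory are placeholders):

// Download only objects whose keys begin with "videos/", resuming any
// previously interrupted downloads (prefix and local path are placeholders)
$client->downloadBucket('/local/directory/videos', 'my-bucket', 'videos/', array(
    'allow_resumable' => true
));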

You can find more documentation on syncing buckets and directories and other great Amazon S3 abstraction layers in the AWS SDK for PHP User Guide.

AWS at Symfony Live Portland 2013

by Jeremy Lindblom | in PHP

A few weeks ago, I had the pleasure of attending the Symfony Live Portland 2013 conference. This year, Symfony Live was co-located with the very large DrupalCon, and though I did not attend any of the DrupalCon sessions, I did get to talk to many Drupal developers during lunches and the hack day. It was awesome to be among so many other PHP developers.

I had the honor of being selected as a speaker at Symfony Live, and the topic of my session was Getting Good with the AWS SDK for PHP (here are the slides and Joind.in event). In this talk, I gave a brief introduction to AWS and its services, showed how to use the AWS SDK for PHP, and demonstrated some code from a sample PHP application that uses Amazon S3 and Amazon DynamoDB to manage its data.

How does the SDK integrate with Symfony?

Since I was in the presence of Symfony developers, I made sure to point out some of the ways that the AWS SDK for PHP currently integrates with the Symfony framework and community.

The SDK uses the Symfony Event Dispatcher

The SDK uses the Symfony Event Dispatcher component quite heavily. Not only are many of the internal details of the SDK implemented with events (e.g., request signing), but users of the SDK can listen for events and inject their own logic into the request flow.

For example, the following code attaches an event listener to an SQS client that will capitalize messages sent to a queue via the SendMessage operation.

use Aws\Common\Aws;
use Guzzle\Common\Event; // Extends Symfony\Component\EventDispatcher\Event

$aws = Aws::factory('/path/to/your/config.php');
$sqs = $aws->get('sqs');

$dispatcher = $sqs->getEventDispatcher();
$dispatcher->addListener('command.before_send', function (Event $event) {
    $command = $event['command'];
    if ($command->getName() === 'SendMessage') {
        // Ensure the message is capitalized
        $command['MessageBody'] = ucfirst($command['MessageBody']);
    }
});

$sqs->sendMessage(array(
    'QueueUrl'    => $queueUrl,
    'MessageBody' => 'an awesome message.',
));

We publish an AWS Service Provider for Silex

For Silex users, we publish an AWS Service Provider for Silex that makes it easier to bootstrap the AWS SDK for PHP within a Silex application. I used this service provider in my presentation with the sample PHP application, so make sure to check out my slides.
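
For reference, registering the provider in a Silex application looks roughly like the following sketch. The class name and configuration keys shown here are assumptions on my part, so check the provider’s README for the exact details:

use Aws\Silex\AwsServiceProvider;

// Register the service provider with your AWS credentials
$app->register(new AwsServiceProvider(), array(
    'aws.config' => array(
        'key'    => 'your-aws-access-key-id',
        'secret' => 'your-aws-secret-access-key'
    )
));

// Service clients are then available through the application container
$s3 = $app['aws']->get('s3');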

You can use the Symfony Finder with Amazon S3

In my presentation, I also pointed out our recent addition of the S3 Stream Wrapper to our SDK and how you can use it in tandem with the Symfony Finder component to find files within your Amazon S3 buckets.

The following example shows how you can use the Symfony Finder to find S3 objects in the bucket "jcl-files", with a key prefix of "family-videos", that are smaller than 50 MB in size and no more than a year old.

use Aws\Common\Aws;
use Symfony\Component\Finder\Finder;

$aws = Aws::factory('/path/to/your/config.php');
$aws->get('s3')->registerStreamWrapper();

$finder = new Finder();
$finder->files()
    ->in('s3://jcl-files/family-videos')
    ->size('< 50M')
    ->date('since 1 year ago');

foreach ($finder as $file) {
    echo $file->getFilename() . PHP_EOL;
}

Others talked about AWS

One of my co-workers, Michael Dowling, also presented at the conference. His presentation was about his open source project, Guzzle, which is a powerful HTTP client library and is used as the foundation of the AWS SDK for PHP. In his talk, Michael also highlighted a few of the ways that the AWS SDK for PHP uses Guzzle. Guzzle is also being used in the core of Drupal 8, so his presentation drew in a crowd of both Drupal and Symfony developers.

Aside from our presentations, there were sessions focused on the Symfony framework as well as others on topics like Composer, caching, and cryptography. David Zuelke and Juozas Kaziukėnas both mentioned how they use AWS services in their talks: Surviving a Prime Time TV Commercial and Process any amount of data. Any time, respectively. It was nice to meet in person many PHP developers I’ve talked with online and to participate in Symfony Live traditions such as PHP Jeopardy and karaoke.

While at the conference, I talked to several developers about what would make a good AWS Symfony bundle or Drupal module, but I’m also curious to find out what you think. So… what would you like to see in an AWS Symfony Bundle? What would make a good AWS Drupal module? Let us know your thoughts in the comments.

Transferring Files To and From Amazon S3

by Jeremy Lindblom | in PHP

A common question that I’ve seen on our PHP forums is whether there is an easy way to directly upload from or download to a local file using the Amazon S3 client in the AWS SDK for PHP.

The typical usage of the PutObject operation in the PHP SDK looks like the following:

use Aws\Common\Aws;

$aws = Aws::factory('/path/to/your/config.php');
$s3 = $aws->get('s3');

$s3->putObject(array(
    'Bucket' => 'your-bucket-name',
    'Key'    => 'your-object-key',
    'Body'   => 'your-data'
));

The Body parameter can be a string of data, a file resource, or a Guzzle EntityBody object. To use a file resource, you could make a simple change to the previous code sample.

$s3->putObject(array(
    'Bucket' => 'your-bucket-name',
    'Key'    => 'your-object-key',
    'Body'   => fopen('/path/to/your/file.ext', 'r')
));
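
The Guzzle EntityBody form works much the same way; here is a minimal sketch:

use Guzzle\Http\EntityBody;

$s3->putObject(array(
    'Bucket' => 'your-bucket-name',
    'Key'    => 'your-object-key',
    'Body'   => EntityBody::factory(fopen('/path/to/your/file.ext', 'r'))
));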

The SDK also provides a shortcut for uploading directly from a file using the SourceFile parameter, instead of the Body parameter.

$s3->putObject(array(
    'Bucket'     => 'your-bucket-name',
    'Key'        => 'your-object-key',
    'SourceFile' => '/path/to/your/file.ext'
));

When downloading an object via the GetObject operation, you can use the SaveAs parameter as a shortcut to save the object directly to a file.

$s3->getObject(array(
    'Bucket' => 'your-bucket-name',
    'Key'    => 'your-object-key',
    'SaveAs' => '/path/to/store/your/downloaded/file.ext'
));

The SourceFile and SaveAs parameters allow you to use the SDK to directly upload files to and download files from S3 very easily.

You can see more examples of how to use these parameters and perform other S3 operations in our user guide page for Amazon S3. Be sure to check out some of our other helpful S3 features, like our MultipartUpload helper and our S3 Stream Wrapper, which allows you to work with objects in S3 using PHP’s native file functions.

Fetch Object Data and Metadata from Amazon S3 (in a Single Call)

by Trevor Rowe | in Ruby

I came across an excellent question earlier this week on our support forums. The question was essentially, "How can I fetch object data and metadata from Amazon S3 in a single call?"

This is a fair question, and one I did not have a good answer to. Amazon S3 returns both object data and metadata in a single GET object response, while AWS::S3::S3Object#read does not. But why?

How It Used to Be

Here is an example of how to get data from an object in S3.

obj = s3.buckets['my-bucket'].objects['key']
data = obj.read

Notice the #read method is returning the object data. This leaves no good place to return the metadata (returning multiple values from a function in Ruby is generally frowned upon). In this case, the aws-sdk gem was getting the data and metadata from S3, but it was discarding the metadata.

The Best of Both Worlds

Last year we added support for streaming reads to AWS::S3::S3Object#read. If you pass a block to #read, then the data is yielded in chunks to the block.

File.open('filename', 'wb') do |file|
  obj.read do |chunk|
    file.write(chunk)
  end
end

Perfect! Since the #read method is yielding data in chunks, its return value becomes unused. This allowed me to patch the #read method to return the object metadata instead of nil.

resp = obj.read do |chunk|
  file.write(chunk)
end

resp #=> {:meta => {"foo" => "bar"}, :restore_in_progress => false, :content_type => "text/plain", :etag => "\"37b51d194a7513e45b56f6524f2d51f2\"", :last_modified => 2013-02-06 12:54:39 -0800, :content_length => 94512, :data => nil}

You can check out the new feature on our GitHub master branch now. This will be part of our next release.

If you see an issue with the AWS SDK for Ruby (aws-sdk gem), please, post an issue on our GitHub issue tracker!