AWS Developer Blog

Concurrency in Version 3 of the AWS SDK for PHP

by Jonathan Eskew | in PHP

From executing API commands in the background to grouping commands and waiters into concurrent pools, version 3 of the AWS SDK for PHP lets you write asynchronous code that blocks only when you need it to. In this blog post, I’ll show you how to take advantage of some of the SDK’s concurrency abstractions, including promises, command pools, and waiters.

Promises

The AWS SDK for PHP makes extensive use of promises internally, relying on the Guzzle Promises library. Each operation defined on an API supported by the SDK has an asynchronous counterpart that can be invoked by tacking Async to the name of the operation. An asynchronous operation will immediately return a promise that will be fulfilled with the result of the operation or rejected with an error:

$s3Client = new Aws\S3\S3Client([
    'region' => 'us-west-2',
    'version' => '2006-03-01',
]);

// This call will block until a response is obtained
$buckets = $s3Client->listBuckets();

// This call will not block
$promise = $s3Client->listBucketsAsync();

// You can provide callbacks to invoke when the request finishes
$promise->then(
    function (Aws\ResultInterface $buckets) {
        echo 'My buckets are: '
             . implode(', ', $buckets->search('Buckets[].Name'));
    },
    function ($reason) {
        echo "The promise was rejected with {$reason}";
    }
);

The Guzzle promises returned by asynchronous operations have a wait method that you can call if you need to block until an operation is completed. In fact, synchronous operations like $s3Client->listBuckets() are calling asynchronous operations under the hood and then waiting on the result.
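
For example, you can start a request, do other work, and then block only when you actually need the result:

$promise = $s3Client->listBucketsAsync();
// ... do other work here ...
// wait() blocks until the promise settles and returns the result
// (or throws if the promise was rejected)
$buckets = $promise->wait();
echo 'I have ' . count($buckets['Buckets']) . ' buckets.';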

Where promises really shine, though, is in groupings. Let’s say you need to upload ten files to an Amazon S3 bucket. Rather than simply loop over the files and upload them one by one, you can create ten promises and then wait for those ten requests to be complete:

$filesToUpload = [
    '/path/to/file/1.ext',
    '/path/to/file/2.ext',
    // ...
    '/path/to/file/10.ext',
];
$promises = [];
foreach ($filesToUpload as $path) {
    $promises []= $s3Client->putObjectAsync([
        'Bucket' => $bucketName,
        'Key' => basename($path),
        'SourceFile' => $path,
    ]);
}
// Construct a promise that will be fulfilled when all
// of its constituent promises are fulfilled
$allPromise = GuzzleHttp\Promise\all($promises);
$allPromise->wait();
$allPromise->wait();

Rather than taking ten times as long as uploading a single file, the asynchronous code above will perform the uploads concurrently. For more information about promises, see the AWS SDK for PHP User Guide.

Waiters

Some AWS operations are naturally asynchronous (for example, those in which a successful response means that a process has been started, but is not necessarily complete). Provisioning Amazon EC2 instances or S3 buckets are good examples. If you were starting a project that required three S3 buckets and an Amazon ElastiCache cluster, you might start out by provisioning those resources programmatically:

$sdk = new Aws\Sdk(['region' => 'us-west-2', 'version' => 'latest']);
$elasticacheClient = $sdk->createClient('elasticache');
$s3Client = $sdk->createClient('s3');
$promises = [];
for ($i = 0; $i < 3; $i++) {
    $promises []= $s3Client->createBucketAsync([
        'Bucket' => "my-bucket-$i",
    ]);
}
$cacheClusterId = uniqid('cache');
$promises []= $elasticacheClient->createCacheClusterAsync([
    'CacheClusterId' => $cacheClusterId,
]);
$metaPromise = GuzzleHttp\Promise\all($promises);
$metaPromise->wait();

Waiting on the $metaPromise will block only until all of the requests sent by the createBucket and createCacheCluster operations have been completed. You would need to use a waiter to block until those resources are available. For example, you can wait on a single bucket with an S3Client’s waitUntil method:

$s3Client->waitUntil('BucketExists', ['Bucket' => $bucketName]);

Like operations, waiters can also return promises, allowing you to compose meta-waiters from individual waiters:

$waiterPromises = [];
for ($i = 0; $i < 3; $i++) {
    // Create a waiter
    $waiter = $s3Client->getWaiter('BucketExists', [
        'Bucket' => "my-bucket-$i",
    ]);
    // Initiate the waiter and retrieve a promise.
    $waiterPromises []= $waiter->promise();
}
$waiterPromises []= $elasticacheClient
    ->getWaiter('CacheClusterAvailable', [
        'CacheClusterId' => $cacheClusterId,
    ])
    ->promise();
// Compose a higher-level promise from the individual waiter promises
$metaWaiterPromise = GuzzleHttp\Promise\all($waiterPromises);
// Block until all waiters have completed
$metaWaiterPromise->wait();

Command Pools

The SDK also allows you to use command pools to fine-tune the way in which a series of operations is performed concurrently. Command pools are created with a client object and an iterable list of commands, which can be created by calling getCommand with an operation name on any SDK client:

// Create an S3Client
$s3Client = new AwsS3S3Client([
    ‘region’ => ‘us-west-2’,
    ‘version’ => ‘latest’,
]);

// Create a list of commands
$commands = [
    $s3Client->getCommand(‘ListObjects’, [‘Bucket’ => ‘bucket1’]),
    $s3Client->getCommand(‘ListObjects’, [‘Bucket’ => ‘bucket2’]),
    $s3Client->getCommand(‘ListObjects’, [‘Bucket’ => ‘bucket3’]),
];

// Create a command pool
$pool = new Aws\CommandPool($s3Client, $commands);

// Begin asynchronous execution of the commands
$promise = $pool->promise();

// Force the pool to complete synchronously
$promise->wait();

How is this different from gathering promises from individual operations, such as by calling $s3Client->listObjectsAsync(…)? One key difference is that no action is taken until you call $pool->promise(), whereas requests are dispatched immediately when you call $s3Client->listObjectsAsync(…). The CommandPool defers the initiation of calls until you explicitly tell it to do so. In addition, a command pool limits concurrency to 25 operations at a time by default; you can tune this simultaneous operation limit to your project’s needs. For a more complex example, see CommandPool in the AWS SDK for PHP User Guide.
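
If the default limit doesn’t suit your workload, you can tune it by passing a configuration array when you construct the pool; here is a brief sketch (the concurrency value is arbitrary):

$pool = new Aws\CommandPool($s3Client, $commands, [
    // Allow at most 5 commands to be in flight at any given time
    'concurrency' => 5,
]);
$pool->promise()->wait();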

With promises, waiters, and command pools, version 3 of the AWS SDK for PHP makes it easy to write asynchronous or concurrent code! We welcome your feedback.

AWS SDK for Java Developer Guide Is Now Open Source

We are happy to announce that the AWS SDK for Java Developer Guide and AWS Toolkit for Eclipse User Guide are now open-sourced on GitHub! You can edit content and code inline, make pull requests, file issues, and send content suggestions to the documentation and SDK teams.

The AWS SDKs have always embraced openness as a design principle. To date, we have received more than 450 issues and 200 pull requests for the AWS SDK for Java. We decided to open the developer and user guides too, so that we can produce the highest quality, most comprehensive SDK documentation possible.

Although our writers will continue to proactively create content, we will also monitor this GitHub repo to round out the content offering for the SDK. You can star the repos, watch them, and start contributing today!

Let us know what you would like to see. Feel free to contribute your own examples, if you have them. We will look at pull requests as they come in. Your content could become part of the core documentation!

Using Custom Request Handlers

Requests are the core of the Go SDK. They handle the network request to a specific service. When you call a service method in the AWS SDK for Go, the SDK creates a request object that contains the input parameters and request lifecycle handlers. The logic involved in each step of a request’s lifecycle, such as Build, Validate, or Send, is handled by its corresponding list of handlers. For example, when the request’s Send method is called, it will iterate through and call every function in the request’s Send handler list. For the full list of handlers, see this API reference page. These handlers  shape and mold  what your request does on the way to AWS and what the response does on the way out.

Customizability is an important aspect of request handlers. By customizing its handlers, you can modify a request’s lifecycle for various use cases. You can do this for an individual request object returned by DoFooRequest client methods or, more interestingly, for an entire service client object, which will affect every request processed by the client.
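
For example, here is a rough sketch (bucket, key, and client setup are placeholders) of attaching a handler to a single request returned by one of the FooRequest methods, leaving the client’s shared handler lists untouched:

package main

import (
	"log"

	"github.com/aws/aws-sdk-go/aws"
	"github.com/aws/aws-sdk-go/aws/request"
	"github.com/aws/aws-sdk-go/aws/session"
	"github.com/aws/aws-sdk-go/service/s3"
)

func main() {
	svc := s3.New(session.Must(session.NewSession()))

	// Build the request without sending it; handlers added here apply
	// only to this request, not to every request made by the client.
	req, out := svc.HeadObjectRequest(&s3.HeadObjectInput{
		Bucket: aws.String("my-bucket"),
		Key:    aws.String("my-key"),
	})
	req.Handlers.Send.PushBack(func(r *request.Request) {
		log.Println("sending", r.Operation.Name)
	})

	if err := req.Send(); err != nil {
		log.Println("request failed:", err)
		return
	}
	log.Println("content length:", aws.Int64Value(out.ContentLength))
}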

An example use case would be gathering additional information on retried requests. In this case, a simple addition to the Retry handler would provide information required for further debugging. Logging is another use case. For example, a logger function can be added to log every request, or a benchmark function can be added to profile requests. Let’s see how you can add a custom logger to different handler lists.


func customLogger(req *request.Request) {
	log.Printf("Op: %s\nParams: %v\nTime: %s\nRetryCount: %d\n\n",
		req.Operation.Name, req.Params, req.Time, req.RetryCount,
	)
}


svc.Handlers.Send.PushBack(customLogger)
svc.Handlers.Unmarshal.PushBack(func(req *request.Request) {
	log.Println("Unmarshal handler")
	customLogger(req)
})

We added the customLogger request handler to the Send handler list and a closure function to the Unmarshal handler list. The only requirement for these functions is that they take a request.Request pointer. The customLogger will log the operation’s name, the parameters passed, the request time, and the retry count. The anonymous function added to Unmarshal will log when the Unmarshal handler was executed and then call customLogger.

These functions are being added to the end of the handler list. During the request’s Send step, it will iterate through the list and execute the functions. Having custom loggers can be useful when developing and debugging applications because they provide more visibility into each API operation’s request/response layers.

In addition to logging, sometimes benchmarking or profiling requests are important. This will give you an idea of what to expect when hitting your production environments.


dt := time.Time{}
svc.Handlers.Send.PushFront(func(req *request.Request) {
	 dt = time.Now()
})
svc.Handlers.Send.PushBack(func(req *request.Request) {
	 fmt.Println("SENDING", time.Now().Sub(dt))
})
// Output:
// SENDING 201.007364ms

Here we push one function to the front of the Send handler list and then compute the elapsed time in a handler added to the end of that list, measuring how long the API request took to send and receive a response.

As we have shown in these code samples, request handlers allow you to easily customize the SDK’s behavior and add new functionality. If you have used handlers to augment the SDK’s request/response functionality, please let us know in the comments!

Symmetric Encryption/Decryption in the AWS SDK for C++ with std::iostream

by Jonathan Henson | in C++

Cryptography is hard in any programming language. It is especially difficult in platform-portable native code where we don’t have the advantage of a consistent platform implementation. Many customers have asked us for an Amazon S3 encryption client that is compatible with the Java and Ruby clients. Although we are not ready to release that yet, we have created a way to handle encryption and decryption across each of our supported platforms: the *nix variations that rely mostly on OpenSSL; Apple, which provides CommonCrypto; and Windows, which exposes the BCrypt/CNG library. Each of these libraries provides dramatically different interfaces, so we had to do a lot of work to force them into a common interface and usage pattern. As of today, you can use the AWS SDK for C++ to encrypt or decrypt your data with AES 256-bit ciphers in the CBC, CTR, and GCM modes. You can use the ciphers directly, or you can use the std::iostream interface. This new feature is valuable even without the high-level client we are hoping to provide soon, so we are excited to tell you about it.

Let’s look at a few use cases:

Suppose we want to write sensitive data to file using a std::ofstream and have it encrypted on disk. However, when we read the file back, we want to parse and use the file in a std::ifstream as plaintext. Because we use AES-GCM in this example, after encryption, the tag and iv must be stored somewhere in order for the data to be decrypted. For this example, we will simply store them into memory so we can pass them to the decryptor.


#include <aws/core/Aws.h>
#include <aws/core/utils/crypto/CryptoStream.h>
#include <fstream>

using namespace Aws::Utils;
using namespace Aws::Utils::Crypto;

int main()
{
    Aws::SDKOptions options;
    Aws::InitAPI(options);
	{
		//create 256 bit symmetric key. This will use the entropy from your platform's crypto implementation
		auto symmetricKey = SymmetricCipher::GenerateKey();

		//create an AES-256 GCM cipher, iv will be autogenerated
		auto encryptionCipher = CreateAES_GCMImplementation(symmetricKey);

		const char* fileName = "./encryptedSensitiveData";

		//write the file to disk and encrypt it
		{        
			//create the stream to receive the encrypted data.
			Aws::OFStream outputStream(fileName, std::ios_base::out | std::ios_base::binary);

			//now create an encryption stream.
			SymmetricCryptoStream encryptionStream(outputStream, CipherMode::Encrypt, *encryptionCipher);
			encryptionStream << "This is a file full of sensitive customer secrets:\n\n";
			encryptionStream << "CustomerName=John Smith\n";
			encryptionStream << "CustomerSSN=867-5309\n";
			encryptionStream << "CustomerDOB=1 January 1970\n\n";
		}

		 //grab the IV that was used to initialize the cipher
		 auto iv = encryptionCipher->GetIV();

		 //since this is GCM, grab the tag once the stream is finished and the cipher is finalized
		 auto tag = encryptionCipher->GetTag();

		//read the file back from disk and deal with it as plain-text
		{
			 //create an AES-256 GCM cipher, passing the key, iv and tag that were used for encryption
			 auto decryptionCipher = CreateAES_GCMImplementation(symmetricKey, iv, tag);

			 //create source stream to decrypt from
			 Aws::IFStream inputStream(fileName, std::ios_base::in | std::ios_base::binary);         

			 //create a decryption stream
			 SymmetricCryptoStream decryptionStream(inputStream, CipherMode::Decrypt, *decryptionCipher);

			 //write the file out to cout using normal stream operations
			 Aws::String line;
			 while(std::getline(decryptionStream, line))
			 {
				 std::cout << line << std::endl;
			 }
		}        
	}
    Aws::ShutdownAPI(options);
    return 0;
}

What if, this time, we want to take plaintext data from local disk and store it encrypted in Amazon S3? We want it to be encrypted in Amazon S3, but we want to write it to disk as plaintext when we download it. This example code is not compatible with the existing Amazon S3 encryption clients; a compatible client is coming soon.


#include <aws/core/Aws.h>
#include <aws/core/utils/crypto/CryptoStream.h>
#include <aws/s3/S3Client.h>
#include <aws/s3/model/PutObjectRequest.h>
#include <aws/s3/model/GetObjectRequest.h>

using namespace Aws::Utils;
using namespace Aws::Utils::Crypto;
using namespace Aws::S3;

int main()
{
    Aws::SDKOptions options;
    Aws::InitAPI(options);
	{
		//create 256 bit symmetric key. This will use the entropy from your platform's crypto implementation
		auto symmetricKey = SymmetricCipher::GenerateKey();    

		const char* fileName = "./localFile";
		const char* s3Key = "encryptedSensitiveData";
		const char* s3Bucket = "s3-cpp-sample-bucket";

		//create an AES-256 CTR cipher
		auto encryptionCipher = CreateAES_CTRImplementation(symmetricKey);

		//create an S3 client
		S3Client client;

		//Put object into S3
		{
			//put object
			Model::PutObjectRequest putObjectRequest;
			putObjectRequest.WithBucket(s3Bucket)
					.WithKey(s3Key);

			//source stream for file we want to put encrypted into S3
			Aws::IFStream inputFile(fileName);

			//set the body to a crypto stream and we will encrypt in transit
			putObjectRequest.SetBody(
					Aws::MakeShared<SymmetricCryptoStream>("crypto-sample", inputFile, CipherMode::Encrypt, *encryptionCipher));

			auto putObjectOutcome = client.PutObject(putObjectRequest);

			if(!putObjectOutcome.IsSuccess())
			{
				std::cout << putObjectOutcome.GetError().GetMessage() << std::endl;
			}
		 }

		 auto iv = encryptionCipher->GetIV();

		 //create cipher to use for decryption of AES-256 in CTR mode.
		 auto decryptionCipher = CreateAES_CTRImplementation(symmetricKey, iv);     

		 {
			//get the object back out of s3
			Model::GetObjectRequest getObjectRequest;
			getObjectRequest.WithBucket(s3Bucket)
					.WithKey(s3Key);

			//destination stream for the decrypted result to be written to.
			Aws::OFStream outputFile(fileName);

			//tell the client to create a crypto stream that knows how to decrypt the data as it comes across the wire.
			// write the decrypted output to outputFile.
			getObjectRequest.SetResponseStreamFactory(
					[&] { return Aws::New<SymmetricCryptoStream>("crypto-sample", outputFile, CipherMode::Decrypt, *decryptionCipher);}
			);

			auto getObjectOutcome = client.GetObject(getObjectRequest);

			if(!getObjectOutcome.IsSuccess())
			{
				std::cout << getObjectOutcome.GetError().GetMessage() << std::endl;
				return -1;
			}
			//the file should now be stored decrypted at fileName        
		}
	}
    Aws::ShutdownAPI(options);

    return 0;
}

Unfortunately, this doesn’t solve the problem of securely transmitting or storing symmetric keys. Soon we will provide a fully automated encryption client that will handle everything, but for now, you can use AWS Key Management Service (AWS KMS) to encrypt and store your encryption keys.

A couple of things to keep in mind:

  • You do not need to use the Streams interface directly. The SymmetricCipher interface provides everything you need for cross-platform encryption and decryption operations; see the sketch that follows this list. The streams are mostly for convenience and interoperating with the web request process.
  • When you use the stream in sink mode, you either need to explicitly call Finalize() or make sure the destructor of the stream has been called. Finalize() makes sure the cipher has been finalized and all hashes (in GCM mode) have been computed. If this method has not been called, the tag from the cipher may not be accurate when you try to pull it.
  • Because encrypted data is binary data, be sure to use the correct stream flags for binary data. Anything that depends on null terminated c-strings can create problems. Use Strings and String-Streams only when you know that the data is plaintext. For all other scenarios, use the non-formatted input and output operations on the stream.
  • AES-GCM with 256-bit keys is not implemented for Apple platforms. CommonCrypto does not expose this cipher mode in its API. As soon as Apple adds AES-GCM to its public interface, we will provide an implementation for that platform. If you need AES-GCM on Apple, you can use our OpenSSL implementation.
  • Try to avoid seek operations. When you seek backwards, we have to reset the cipher and re-encrypt everything up to your seek position. For S3 PutObject operations, we recommend that you set the Content-Length ahead of time instead of letting us compute it automatically. Be sure to turn off payload hashing.
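
To expand on the first point, here is a minimal sketch of driving a SymmetricCipher directly, with no streams involved. It assumes AES-256 in CTR mode, skips error handling, and the Cipher.h and Factories.h header paths are our assumption:

#include <aws/core/Aws.h>
#include <aws/core/utils/crypto/Cipher.h>
#include <aws/core/utils/crypto/Factories.h>
#include <cstring>
#include <iostream>

using namespace Aws::Utils;
using namespace Aws::Utils::Crypto;

int main()
{
    Aws::SDKOptions options;
    Aws::InitAPI(options);
    {
        //create a 256 bit key and an AES-256 CTR cipher; the iv is auto-generated
        auto symmetricKey = SymmetricCipher::GenerateKey();
        auto encryptionCipher = CreateAES_CTRImplementation(symmetricKey);

        //encrypt a buffer, then finalize the cipher
        const char* secret = "a string of sensitive data";
        CryptoBuffer plainText(reinterpret_cast<const unsigned char*>(secret), strlen(secret));
        auto encrypted = encryptionCipher->EncryptBuffer(plainText);
        auto encryptedTail = encryptionCipher->FinalizeEncryption();

        //decrypt with a second cipher built from the same key and iv
        auto decryptionCipher = CreateAES_CTRImplementation(symmetricKey, encryptionCipher->GetIV());
        auto decrypted = decryptionCipher->DecryptBuffer(encrypted);
        auto decryptedTail = encryptedTail.GetLength() > 0 ?
                decryptionCipher->DecryptBuffer(encryptedTail) : CryptoBuffer();
        auto decryptedFinal = decryptionCipher->FinalizeDecryption();

        //write the recovered plaintext to stdout
        for (auto buffer : { &decrypted, &decryptedTail, &decryptedFinal })
        {
            if (buffer->GetLength() > 0)
            {
                std::cout.write(reinterpret_cast<const char*>(buffer->GetUnderlyingData()),
                                static_cast<std::streamsize>(buffer->GetLength()));
            }
        }
        std::cout << std::endl;
    }
    Aws::ShutdownAPI(options);
    return 0;
}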

We hope you enjoy this feature. We look forward to seeing the ways people use it. Please leave your feedback on GitHub, and as always, feel free to send us pull requests.

Updates to Credential and Region Handling

by Steve Roberts | in .NET

Version 3.1.73.0 of the AWS Tools for Windows PowerShell and AWS SDK for .NET (AWSSDK.Core version 3.1.6.0),  released today, contain enhancements to the way credentials and region data can be supplied to your SDK applications and PowerShell scripts, including the ability to use SAML federated credentials in your SDK applications. We’ve also refactored support for querying Amazon EC2 instance metadata and made it available to your code. Let’s take a look at the changes.

SDK Updates

Credential Handling

In 2015, we launched support for using SAML federated credentials with the AWS PowerShell cmdlets. (See New Support for Federated Users in the AWS Tools for Windows PowerShell.) We’ve now extended the SDK so that applications written against it can also use the SAML role profiles described in the blog post. To use this support in your application, you must first set up the role profile. The details for using the PowerShell cmdlets to do this appear in the blog post. Then, in the same way you do with other credential profiles, you simply reference the profile in your application’s app.config/web.config files with the AWSProfileName appsetting key. The SAML support to obtain AWS credentials is contained in the SDK’s Security Token Service assembly (AWSSDK.SecurityToken.dll), which is loaded at runtime. Be sure this assembly is available to your application at runtime.
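
For example, assuming you have created a SAML role profile named SamlDemoProfile (the profile name here is just a placeholder), the appsetting entry in app.config would look something like this:

<configuration>
  <appSettings>
    <add key="AWSProfileName" value="SamlDemoProfile"/>
  </appSettings>
</configuration>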

The SDK has also been updated to support reading credentials from the AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY environment variables (the same variables used with other AWS SDKs). For legacy support, the AWS_SECRET_KEY variable is still supported.

If no credentials are supplied to the constructors of the service clients, the SDK will probe to find a set to use. As of this release, these are the probing tests, performed in order:

  1. If an explicit access key/secret access key or profile name is found in the application’s app.config/web.config files, use it.
  2. If a credential profile named "default" exists, use it. (This profile can contain AWS credentials or it can be a SAML role profile.)
  3. If credentials are found in environment variables, use them.
  4. Finally, check EC2 instance metadata for instance profile credentials.

Specifying Region

To set the region when you instantiate AWS service clients in your SDK applications, you used to have two options: hard-code the region in the application code, using either the system name (for example, ‘us-east-1’) or the RegionEndpoint helper properties (for example, RegionEndpoint.USEast1), or supply the region system name in the application’s app.config/web.config files using the AWSRegion appsetting key. The SDK has now been updated to enable region detection through an environment variable or, if your code is running on an EC2 instance, from instance metadata.

To use an environment variable to set the AWS region, you simply set the variable AWS_REGION to the system name of the region you want service clients to use. If you need to override this for a specific client, simply pass the required region in the service client constructor. The AWS_REGION variable is used by the other AWS SDKs.

When running on an EC2 instance, your SDK-based applications will auto-detect the region in which the instance is running from EC2 instance metadata if no explicit setting is found. This means you can now deploy code without needing to hard-code any region in your app.config/web.config files. You can instead rely on the SDK to auto-detect the region when your application instantiates clients for AWS services.

Just as with credentials, if no region information is supplied to a service client constructor, the SDK probes to see if the region can be determined automatically. As of right now, these are the tests performed:

  1. Is an AWSRegion appsetting key present in the application’s app.config/web.config files? If so, use it.
  2. Is the AWS_REGION environment variable set? If so, use it.
  3. Attempt to read EC2 instance metadata and obtain the region in which the instance is running.

PowerShell Updates

Credential Handling

You can now supply credentials to the cmdlets by using the AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY environment variables. (You might find this helpful when you attempt to run the cmdlets in a user identity where credential profiles are inconvenient to set up, for example, the local system account.)
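
For example, you could set the variables in the current session before invoking any cmdlets (the key values below are the standard documentation placeholders, not real credentials):

$env:AWS_ACCESS_KEY_ID = "AKIAIOSFODNN7EXAMPLE"
$env:AWS_SECRET_ACCESS_KEY = "wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY"
Get-S3Bucket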

If you have enabled SAML federated authentication for use with the cmdlets, they now support the use of proxy data configured using the Set-AWSProxy cmdlet when making authentication requests against the ADFS endpoint. Previously, a proxy had to be set at the machine-wide level.

When the AWS cmdlets run, they follow this series of tests to obtain credentials:

  1. If explicit credential parameters (-AccessKey, -SecretKey, -SessionToken, for example) have been supplied to the cmdlet or if a profile has been specified using the -ProfileName parameter, use those credentials. The profile supplied to -ProfileName can contain regular AWS credentials or it can be a SAML role profile.
  2. If the current shell has been configured with default credentials (held in the $StoredAWSCredentials variable), use them.
  3. If a credential profile with the name "default" exists, use it. (This profile can contain regular AWS credentials or it can be a SAML role profile.)
  4. If the new AWS_ACCESS_KEY_ID/AWS_SECRET_ACCESS_KEY environment variables are set, use the credentials they contain.
  5. If EC2 instance metadata is available, look for instance profile credentials.

Specifying Region

In addition to existing support for specifying region using the -Region parameter on cmdlets (or setting a shell-wide default using Set-DefaultAWSRegion), the cmdlets in the AWSPowerShell module can now detect region from the AWS_REGION environment variable or from EC2 instance metadata.

Some users may run the Initialize-AWSDefaults cmdlet when opening a shell on an EC2 instance. Now that the region can be detected from instance metadata, the first time you run this cmdlet on an EC2 instance you are no longer prompted to select a region from a menu. If you want to run PowerShell scripts against a region other than the one in which the instance is running, you can override the default detection by supplying the -Region parameter, with an appropriate value, to the cmdlet. You can also continue to use the Set-DefaultAWSRegion cmdlet in your shell or scripts, or add the -Region parameter to any cmdlet, to direct calls to a region that differs from the region hosting the instance.
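
For example, on an EC2 instance, the following cmdlets illustrate the detection and override behavior just described (the region names are arbitrary):

# Uses the region detected from instance metadata
Get-S3Bucket

# Overrides the detected region for a single call
Get-S3Bucket -Region eu-west-1

# Sets a shell-wide default for subsequent cmdlets
Set-DefaultAWSRegion -Region eu-west-1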

Just as with credentials, the cmdlets will search for the appropriate region to use when invoked:

  1. If a -Region parameter was supplied to the cmdlet, use it.
  2. If the current shell contains a default region ($StoredAWSRegion variable), use it.
  3. If the AWS_REGION environment variable is set, use it.
  4. If the credential profile ‘default’ exists and it contains a default region value (set by previous use of Initialize-AWSDefaults), use it.
  5. If EC2 instance metadata is available, inspect it to determine the region.

Reading EC2 Instance Metadata

As part of extending the SDK and PowerShell tools to read region information from EC2 instance metadata, we have refactored the metadata reader class (Amazon.EC2.Util.EC2Metadata) from the AWSSDK.EC2.dll assembly into the core runtime (AWSSDK.Core.dll) assembly. The original class has been marked obsolete.

The replacement class is Amazon.Util.EC2InstanceMetadata. It contains additional helper methods to read more of the EC2 instance metadata than the original class. For example, you can now read from the dynamic data associated with the instance. For more information, see http://docs.aws.amazon.com/AWSEC2/latest/WindowsGuide/ec2-instance-metadata.html. Region information is held in what is known as the identity document for the instance. The document is in JSON format. The class contains a helper property, Region, which extracts the relevant data for you and returns it as a RegionEndpoint instance, making it super easy to query this in your own applications. You can also easily read the instance monitoring, signature, and PKCS7 data from convenient properties.

We’ve not forgotten scripters either! Previously, to read instance metadata from PowerShell, you had to run the Invoke-WebRequest cmdlet against the metadata endpoint and parse the data yourself. The AWSPowerShell module now contains a cmdlet dedicated to the task: Get-EC2InstanceMetadata. Some examples:

PS C:\Users\Administrator> Get-EC2InstanceMetadata -Category LocalIpv4
10.232.46.188

PS C:\Users\Administrator> Get-EC2InstanceMetadata -Category AvailabilityZone
us-west-2a

PS C:\Users\Administrator> Get-EC2InstanceMetadata -ListCategory
AmiId
LaunchIndex
ManifestPath
AncestorAmiId
BlockDeviceMapping
InstanceId
InstanceType
LocalHostname
LocalIpv4
KernelId
AvailabilityZone
ProductCode
PublicHostname
PublicIpv4
PublicKey
RamdiskId
Region
ReservationId
SecurityGroup
UserData
InstanceMonitoring
IdentityDocument
IdentitySignature
IdentityPkcs7

PS C:\Users\Administrator> Get-EC2InstanceMetadata -path /public-keys/0/openssh-key
ssh-rsa AAAAB3N...na27jfTV keypairname

We hope you find these new capabilities helpful. Be sure to let us know in the comments if there are other scenarios we should look at!

AWS SDK for C++: Simplified Configuration and Initialization

by Jonathan Henson | in C++

Many of our users are confused by initializing and installing a memory manager, enabling logging, overriding the HTTP stack, and installing custom cryptography implementations. Not only are these tasks confusing, they are tedious and require an API call to set up and tear down each component. To make matters worse, on some platforms, we were silently initializing libCurl and OpenSSL. This caused the AWS SDK for C++ to mutate static global state, creating problems for the programs that relied on it.

As of version 0.12.x, we have added a new initialization and shutdown process. We have added the structure Aws::SDKOptions to store each of the SDK-wide configuration options. You can use SDKOptions to set a memory manager, turn on logging, provide a custom logger, override the HTTP implementation, and install your own cryptography implementation at runtime. By default, this structure has all of the settings you are used to, but manipulating those options should now be more clear and accessible.

This change has a few side effects.

  • First, the HTTP factory is now globally installed instead of being passed to service-client constructors. It doesn’t really make sense to force the HTTP client factory through each client; if this setting is being customized, it will be used across the SDK.
  • Second, you can turn on logging simply by setting the logging level.
  • Third, if your application is already using OpenSSL or libCurl, you have the option of bypassing their initialization in the SDK entirely. This is particularly useful for legacy applications that need to interoperate with the AWS SDK.
  • Finally, all code using the SDK must call the Aws::InitAPI() function before making any other API calls and the Aws::ShutdownAPI() function when finished using the SDK. If you do not call these new functions, your application may crash for all SDK versions later than 0.12.x.

Here are a few recipes:


#include <aws/core/Aws.h>

Just use the default configuration:


   SDKOptions options;
   Aws::InitAPI(options);
   //make your SDK calls in here.
   Aws::ShutdownAPI(options);

Turn logging on using the default logger:


   SDKOptions options;
   options.loggingOptions.logLevel = Aws::Utils::Logging::LogLevel::Info;
   Aws::InitAPI(options);
   //make your SDK calls in here.
   Aws::ShutdownAPI(options);

Install a custom memory manager:


    MyMemoryManager memoryManager;

    SDKOptions options;
    options.memoryManagementOptions.memoryManager = &memoryManager;
    Aws::InitAPI(options);
    //make your SDK calls in here.
    Aws::ShutdownAPI(options);

Override the default HTTP client factory:


    SDKOptions options;
    options.httpOptions.httpClientFactory_create_fn = [](){ return Aws::MakeShared<MyCustomHttpClientFactory>("ALLOC_TAG", arg1); };
    Aws::InitAPI(options);
    //make your SDK calls in here
    Aws::ShutdownAPI(options);

Note: SDKOptions::HttpOptions takes a closure instead of a std::shared_ptr. We do this for all of the factory functions because the memory manager will not have been installed at the time you will need to allocate this memory. You pass your closure to the SDK, and it is called when it is safe to do so. The simplest way to do this is with a lambda expression.

As we get ready for General Availability, we wanted to refine our initialization and shutdown scheme to be flexible for future feature iterations and new platforms. This update will allow us to provide new features without breaking users. We welcome your feedback about how we can improve this feature.

How to Analyze AWS Config Snapshots with ElasticSearch and Kibana

by Vladimir Budilov | in Python

Introduction
In this blog post, I will walk you through a turn-key solution that includes one of our most recently released services, AWS Config. This solution shows how to automate the ingestion of your AWS Config snapshots into the ElasticSearch/Logstash/Kibana (ELK) stack for searching and mapping your AWS environments. Using this functionality, you can do free-form searches, such as “How many EC2 instances are tagged PROD?” or “How many EC2 instances are currently connected to this master security group?”

Prerequisites
In this post, I assume that you have an ELK stack up and running. (Although the “L” isn’t really required, the ELK acronym has stuck, so I’ll continue to use it.)

Here are some ways to get the ELK stack up and running:

  1. You can use our Amazon ElasticSearch Service, which provides the two main components you’ll be using: ElasticSearch and Kibana.
  2. Take a look at this excellent post by Logz.io. It provides step-by-step instructions for installing the ELK stack on an EC2 instance.
  3. You can install Docker locally or create an Amazon EC2 Container Service (Amazon ECS) cluster and then install the ELK Docker image. Follow the instructions here.

You can download the Python app referenced in this post from https://github.com/awslabs/aws-config-to-elasticsearch.

Why AWS Config?
AWS Config provides a detailed view of the configuration of your AWS resources and their relationships to other resources. For example, you can find out which resources are set up in your default VPC or which Availability Zone has the most EC2 instances. AWS Config also captures the history of configuration changes made to these resources and allows you to look them up through an API. The service allows you to create one-time snapshots or turn on configuration recording, which provides change snapshots and notifications.

Why ELK?
ElasticSearch and Kibana are some of the most popular free, open-source solutions out there to analyze and visualize data. ElasticSearch, which is built on the Lucene search engine, allows for schema-less data ingestion and querying. It provides out-of-the-box data analysis queries and filters, such as data aggregates and term counts. Kibana is the visualization and searching UI that opens up the ElasticSearch data to the regular user.

The Solution
I’ve created a Python app that automates the process of getting AWS Config data from your AWS account to ELK. In short, it asks AWS Config to take a snapshot in each region in which you have the service enabled; waits until the snapshot is uploaded to the configured Amazon S3 bucket; copies the snapshot from the S3 bucket; parses the snapshot (which is just a huge JSON blob); and ingests the JSON array elements into ELK.
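
For reference, kicking off a snapshot is just a couple of boto3 calls. Here is a rough sketch of that first step, assuming AWS Config already has a delivery channel configured in the region:

import boto3

config = boto3.client('config', region_name='us-east-1')

# Find the delivery channel so we know which S3 bucket the snapshot lands in
channel = config.describe_delivery_channels()['DeliveryChannels'][0]

# Ask AWS Config to deliver a new snapshot to that bucket
snapshot_id = config.deliver_config_snapshot(
    deliveryChannelName=channel['name'])['configSnapshotId']

print("Snapshot %s will be delivered to s3://%s"
      % (snapshot_id, channel['s3BucketName']))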

Running the Script
You have a couple of options when you run the app. You can specify the region that you want to export and load by including -r and the region name as shown:

./esingest.py -d localhost:9200 -r us-east-1

Or you can simply include the destination (which is required). The app will loop over all of the regions. The following output is an example of what you would see if you don’t specify the region:

./esingest.py -d localhost:9200

figure-1-app-output

Working with Kibana
Now that you have ingested the data into ElasticSearch, you need to use Kibana to index the data. The first time you open Kibana, the Settings page will be displayed. Use this page to configure the searchable index. For simplicity’s sake, under Index name or pattern, type *, and for Time-field name, choose snapshotTimeIso. You can use any date field from the drop-down list, such as resourceCreationTime:

figure-2-kibana-configuration

This will index all of your ElasticSearch indices and use the snapshotTimeIso as the time-series field. You will have duplicates if you run esingest without deleting the current ELK indices, but you will be able to include the snapshot time in your search queries to get time-based results.

Now that we have indexed the data in Kibana, let’s do some searching. Choose the Discover tab and change the time filter by clicking the text in the upper-right corner:

figure-3-kibana-discover

For now, choose Last 5 years, and then minimize the Time Filter section.

For our first search, type resourceType: "aws::ec2::instance" in the text field. You will see all of your EC2 instances in the search results. The time graph shows when they were added to ElasticSearch. Because I ran esingest just once, there’s only one Config snapshot loaded, and only one timestamp will show up.

figure-4-kibana-search-instances

There are many other search queries you can use. Kibana supports the Lucene query syntax, so see this tutorial for examples and ideas.
As you can see, the time filter shows when the data was ingested into ElasticSearch. You might have duplicates here, so you can specify the instance ID and the exact snapshot time (input: resourceType: "*Instance*" AND "sg-a6f641c0*").

figure-5-search-instances-and-securitygroup

Kibana Visualize Functionality
In addition to search functionality, Kibana provides a way to visualize search results and create search slices. Let’s look at some real-world use cases that I’ve encountered while talking to customers. Click the Visualize tab, choose Pie Chart, and start exploring!

What’s my EC2 distribution between Availability Zones?
Input: resourceType: "aws::ec2::Instance"

figure-6-kibana-visuralize-instances

Let’s create a sub-aggregation and add the tags that are assigned to those EC2 instances:

Input: resourceType: "aws::ec2::Instance"

figure-7-kibana-visualize-instances

Which AMIs were used to create your EC2 instances, and when were they created?
Input: *

figure-8-kibana-visualize-instances-and-regions

How many instances use a security group that you have set up?
Input: "sg-a6f641c0*"

figure-9-kibana-visualize-instances-and-sg

Conclusion
AWS Config is a useful tool for understanding what’s running in your AWS account. The combination of ELK and AWS Config offers AWS admins a lot of advantages that are worth exploring.

Serverless Service Discovery: Part 4: Registrar

by Magnus Bjorkman | in Python

In this, the last part of our serverless service discovery series, we will show how to register and look up a new service. We will add these components:

AWS Lambda Registrar Agent

In Docker, it is common to have container agents that add functionality to your Docker deployment. We will borrow from this concept and build a Lambda registrar agent that will manage the registration and monitoring of a service.


def component_status(lambda_functions, rest_api_id):
    """Checking component status of REST API."""
    any_existing = False
    any_gone = False
    client = boto3.client('lambda')
    for lambda_function in lambda_functions:
        try:
            logger.info("checking Lambda: %s" % (lambda_function,))
            client.get_function_configuration(
                            FunctionName=lambda_function)
            any_existing = True
        except botocore.exceptions.ClientError:
            any_gone = True

    client = boto3.client('apigateway')
    try:
        logger.info("checking Rest API: %s" % (rest_api_id,))
        client.get_rest_api(restApiId=rest_api_id)
        any_existing = True
    except botocore.exceptions.ClientError:
        any_gone = True

    if (not any_existing):
        return "service_removed"
    elif (any_gone):
        return "unhealthy"
    else:
        return "healthy"


def lambda_handler(event, context):
    """Lambda hander for agent service registration."""
    with open('tmp/service_properties.json') as json_data:
        service_properties = json.load(json_data)

    logger.info("service_name: %s" % (service_properties['service_name'],))
    logger.info("service_version: %s" % (service_properties['service_version'],))

    status = component_status(service_properties['lambda_functions'],
                              service_properties['rest_api_id'])

    register_request = {
            "service_name": service_properties['service_name'],
            "service_version": service_properties['service_version'],
            "endpoint_url": service_properties['endpoint_url'],
            "ttl": "300"
            }
    if (status == 'healthy'):
        logger.info('registering healthy service')

        register_request["status"] = 'healthy'

        response = signed_post(
          service_properties['discovery_service_endpoint']+"/catalog/register",
          "us-east-1",
          "execute-api",
          json.dumps(register_request))


    elif (status == 'unhealthy'):
        logger.info('registering unhealthy service')

        register_request["status"] = 'unhealthy'

        response = signed_post(
          service_properties['discovery_service_endpoint']+"/catalog/register",
          "us-east-1",
          "execute-api",
          json.dumps(register_request))

    else:
        logger.info('removing service and registrar')

        deregister_request = {
            "service_name": service_properties['service_name'],
            "service_version": service_properties['service_version']
            }

        response = signed_post(
            service_properties['discovery_service_endpoint'] +
            "/catalog/deregister",
            "us-east-1",
            "execute-api",
            json.dumps(deregister_request))

        client = boto3.client('lambda')
        client.delete_function(
                 FunctionName=service_properties['registrar_name'])

The Lambda registrar agent is packaged with a property file that defines the Lambda functions and Amazon API Gateway deployment that are part of the service. The registrar agent uses the component_status function to inspect the state of those parts and takes action, depending on what it discovers:

  • If all of the parts are there, the service is considered healthy. The register function is called with the service information and a healthy status.
  • If only some of the parts are there, the service is considered unhealthy. The register function is called with the service information and an unhealthy status.
  • If none of the parts are there, the service is considered to have been removed. The deregister function is called, and the Lambda agent will delete itself because it is no longer needed.

Subsequent register function calls will overwrite the information, so as the health status of our services changes, we can call the function repeatedly. In fact, when we deploy the agent with our Hello World service, we will show how to put the Lambda registrar agent on a five-minute schedule to continuously monitor our service.

Deploy the Hello World Service with the Lambda Agent

We will first implement our simple Hello World Lambda function:


def lambda_handler(api_parameters, context):
    """Hello World Lambda function."""
    return {
            "message": "Hello "+api_parameters['name']
            }

We will create a Swagger file for the service:


{
  "swagger": "2.0",
  "info": {
    "title": "helloworld_service",
    "version": "1.0.0"
  },
  "basePath": "/v1",
  "schemes": ["https"],
  "consumes": ["application/json"],
  "produces": ["application/json"],
  "paths": {
    "/helloworld/{name}": {
      "parameters": [{
        "name": "name",
        "in": "path",
        "description": "The name to say hello to.",
        "required": true,
        "type": "string"
      }],
      "get": {
        "responses": {
          "200": {
            "description": "Hello World message"
          }
        },
        "x-amazon-apigateway-integration": {
          "type": "aws",
          "uri": "arn:aws:apigateway:us-east-1:lambda:path/2015-03-31/functions/$helloworld_serviceARN$/invocations",
          "httpMethod": "POST",
          "requestTemplates": {
            "application/json": "{\"name\": \"$input.params('name')\"}"
          },
          "responses": {
            "default": {
              "statusCode": "200",
              "schema": {
                "$ref": "#/definitions/HelloWorldModel"
              }
            }
          }
        }
      }
    }
  },
  "definitions": {
    "HelloWorldModel": {
      "type": "object",
      "properties": {
        "message": {
          "type": "string"
        }
      },
      "required": ["message"]
    }
  }
}

Now we are ready to pull everything we have done in this blog series together: we will deploy this service with a Lambda registrar agent that registers and deregisters it with our serverless discovery service. First, we need to add the requests Python module to the directory we are deploying from because our Lambda registrar agent is dependent on it.


pip install requests -t /path/to/project-dir

Second, we deploy the Hello World service and the Lambda registrar agent:


ACCOUNT_NUMBER = "<your AWS account number>"

######################################
# Deploy Hello World Service
######################################
create_deployment_package("/tmp/helloworld.zip", ["helloworld_service.py"])
hello_world_arn = create_lambda_function(
                       "/tmp/helloworld.zip",
                       "helloworld_service",
                       "arn:aws:iam::"+ACCOUNT_NUMBER+":role/lambda_s3",
                       "helloworld_service.lambda_handler",
                       "Hello World service.",
                       ACCOUNT_NUMBER)
replace_instances_in_file("swagger.json",
                          "/tmp/swagger_with_arn.json",
                          "$helloworld_serviceARN$",
                          hello_world_arn)
api_id = create_api("/tmp/swagger_with_arn.json")
rest_api_id, stage, endpoint_url = deploy_api(api_id, "/tmp/swagger_with_arn.json", "dev")

######################################
# Deploy Lambda Registrar Agent
######################################
with open('/tmp/service_properties.json',
          'w') as outfile:
    json.dump(
      {
       "lambda_functions": ["helloworld_service"],
       "rest_api_id": rest_api_id,
       "stage": stage,
       "endpoint_url": endpoint_url,
       "service_name": "helloworld",
       "service_version": "1.0",
       "discovery_service_endpoint":
       "https://1vvw0qvh4i.execute-api.us-east-1.amazonaws.com/dev",
       "registrar_name": "registrar_"+rest_api_id
       }, outfile)

create_deployment_package("/tmp/helloworld_registrar.zip",
                          ["registrar.py", "/tmp/service_properties.json",
                           "requests"])
registrar_arn = create_lambda_function(
                       "/tmp/helloworld_registrar.zip",
                       "registrar_"+rest_api_id,
                       "arn:aws:iam::"+ACCOUNT_NUMBER+":role/lambda_s3",
                       "registrar.lambda_handler",
                       "Registrar for Hello World service.",
                       ACCOUNT_NUMBER)

After we have deployed the Hello World service, we create a JSON file (service_properties.json) with some of the outputs from that deployment. This JSON file is packaged with the Lambda registrar agent.
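
With placeholder values filled in (the API ID and endpoint URL below are made up), the generated service_properties.json looks roughly like this:

{
    "lambda_functions": ["helloworld_service"],
    "rest_api_id": "a1b2c3d4e5",
    "stage": "dev",
    "endpoint_url": "https://a1b2c3d4e5.execute-api.us-east-1.amazonaws.com/dev",
    "service_name": "helloworld",
    "service_version": "1.0",
    "discovery_service_endpoint": "https://1vvw0qvh4i.execute-api.us-east-1.amazonaws.com/dev",
    "registrar_name": "registrar_a1b2c3d4e5"
}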

Both the service and the agent are now deployed, but nothing is triggering the agent to execute. We will use the following to create a five-minute monitoring schedule using CloudWatch events:


client = boto3.client('events')
response = client.put_rule(
    Name="registrar_"+rest_api_id,
    ScheduleExpression='rate(5 minutes)',
    State='ENABLED'
)
rule_arn = response['RuleArn']

lambda_client = boto3.client('lambda')
response = lambda_client.add_permission(
        FunctionName=registrar_arn,
        StatementId="registrar_"+rest_api_id,
        Action="lambda:InvokeFunction",
        Principal="events.amazonaws.com",
        SourceArn=rule_arn
    )

response = client.put_targets(
    Rule="registrar_"+rest_api_id,
    Targets=[
        {
            'Id': "registrar_"+rest_api_id,
            'Arn': registrar_arn
        },
    ]
)

Now we have deployed a service that is being continuously updated in the discovery service. We can use it like this:


############################
# 1. Do service lookup
############################
request_url="https://yourrestapiid.execute-api.us-east-1.amazonaws.com/"\
            "dev/catalog/helloworld/1.0"
response = requests.get(request_url)
json_response = json.loads(response.content)


############################
# 2. Use the service
############################
request_url=("%s/helloworld/Magnus" % (json_response['endpoint_url'],))

response = requests.get(request_url)
json_response = json.loads(response.content)
logger.info("Message: %s" % (json_response['message'],))

We should get the following output:


INFO:root:Message: Hello Magnus

Summary

We have implemented a fairly simple but functional discovery service without provisioning any servers or containers. We can build on this by adding more advanced monitoring, circuit breakers, caching, additional protocols for discovery, etc. By providing a stable host name for our discovery service (instead of the one generated by API Gateway), we can make that a central part of our microservices architecture.

We showed how to use Amazon API Gateway and AWS Lambda to build a discovery service using Python, but the approach is general. It should work for other services you want to build. The examples provided for creating and updating the services can be enhanced and integrated into any CI/CD platforms to create a fully automated deployment pipeline.

Serverless Service Discovery: Part 3: Registration

by Magnus Bjorkman | in Python

In this, the third part of our serverless service discovery series, we will show how to configure Amazon API Gateway to require AWS Identity and Access Management (IAM) for authentication and how to create a V4 signature to call our register and deregister methods.

We have created all the functions required to manage our API and code, so we can jump directly into creating our new functions.

Registering and Deregistering Services

We start by creating a Lambda function for registering a service:


def lambda_handler(api_parameters, context):
    """Lambda hander for registering a service."""
    logger.info("lambda_handler - service_name: %s"
                " service_version: %s"
                % (api_parameters["service_name"],
                   api_parameters["service_version"]))

    table = boto3.resource('dynamodb',
                           region_name='us-east-1').Table('Services')

    table.put_item(
           Item={
                'name': api_parameters["service_name"],
                'version': api_parameters["service_version"],
                'endpoint_url': api_parameters["endpoint_url"],
                'ttl': int(api_parameters["ttl"]),
                'status': api_parameters["status"],
            }
        )

This function takes the input and stores it in Amazon DynamoDB. If you call the function with the same service name and version (our DynamoDB key), then it will overwrite the existing item.

That is followed by the function to deregister a service:


def lambda_handler(api_parameters, context):
    """Lambda hander for deregistering a service."""
    logger.info("lambda_handler - service_name: %s"
                " service_version: %s"
                % (api_parameters["service_name"],
                   api_parameters["service_version"]))

    table = boto3.resource('dynamodb',
                           region_name='us-east-1').Table('Services')

    table.delete_item(
            Key={
                'name': api_parameters["service_name"],
                'version': api_parameters["service_version"]
            }
        )

The function removes the item from the DynamoDB table based on the service name and version.

We need to add the new functions and API methods to the Swagger file:


{
  "swagger": "2.0",
  "info": {
    "title": "catalog_service",
    "version": "1.0.0"
  },
  "basePath": "/v1",
  "schemes": ["https"],
  "consumes": ["application/json"],
  "produces": ["application/json"],
  "paths": {
    "/catalog/{serviceName}/{serviceVersion}": {
      "parameters": [{
        "name": "serviceName",
        "in": "path",
        "description": "The name of the service to look up.",
        "required": true,
        "type": "string"
      },
      {
        "name": "serviceVersion",
        "in": "path",
        "description": "The version of the service to look up.",
        "required": true,
        "type": "string"
      }],
      "get": {
        "responses": {
          "200": {
            "description": "version information"
          },
          "404": {
            "description": "service not found"
          }
        },
        "x-amazon-apigateway-integration": {
          "type": "aws",
          "uri": "arn:aws:apigateway:us-east-1:lambda:path/2015-03-31/functions/$catalog_serviceARN$/invocations",
          "httpMethod": "POST",
          "requestTemplates": {
            "application/json": "{\"service_name\": \"$input.params('serviceName')\",\"service_version\": \"$input.params('serviceVersion')\"}"
          },
          "responses": {
            "default": {
              "statusCode": "200",
              "schema": {
                "$ref": "#/definitions/CatalogServiceModel"
              }
            },
            ".*NotFound.*": {
              "statusCode": "404",
              "responseTemplates" : {
                 "application/json": "{\"error_message\":\"Service Not Found\"}"
                } 
            } 
          }
        }
      }
    },
    "/catalog/register": {
      "post": {
        "responses": {
          "201": {
            "description": "service registered"
          }
        },
        "parameters": [{
          "name": "body",
          "in": "body",
          "description": "body object",
          "required": true,
          "schema": {
            "$ref":"#/definitions/CatalogRegisterModel"
          }
        }],
        "x-amazon-apigateway-auth" : {
          "type" : "aws_iam" 
        },
        "x-amazon-apigateway-integration": {
          "type": "aws",
          "uri": "arn:aws:apigateway:us-east-1:lambda:path/2015-03-31/functions/$catalog_registerARN$/invocations",
          "httpMethod": "POST",
          "requestTemplates": {
            "application/json": "$input.json('$')"
          },
          "responses": {
            "default": {
              "statusCode": "201"
            } 
          }
        }
      } 
    },
    "/catalog/deregister": {
      "post": {
        "responses": {
          "201": {
            "description": "service deregistered"
          }
        },
        "parameters": [{
          "name": "body",
          "in": "body",
          "description": "body object",
          "required": true,
          "schema": {
            "$ref":"#/definitions/CatalogDeregisterModel"
          }
        }],
        "x-amazon-apigateway-auth" : {
          "type" : "aws_iam" 
        },
        "x-amazon-apigateway-integration": {
          "type": "aws",
          "uri": "arn:aws:apigateway:us-east-1:lambda:path/2015-03-31/functions/$catalog_deregisterARN$/invocations",
          "httpMethod": "POST",
          "requestTemplates": {
            "application/json": "$input.json('$')"
          },
          "responses": {
            "default": {
              "statusCode": "201"
            } 
          }
        }
      } 
    }
  },
  "definitions": {
    "CatalogServiceModel": {
      "type": "object",
      "properties": {
        "endpoint_url": {
          "type": "string"
        },
        "ttl": {
          "type": "integer"
        },
        "status": {
          "type": "string"
        }
      },
      "required": ["endpoint_url", "ttl", "status"]
    },
    "CatalogRegisterModel": {
      "type": "object",
      "properties": {
        "service_name": {
          "type": "string"
        },
        "service_version": {
          "type": "string"
        },
        "endpoint_url": {
          "type": "string"
        },
        "ttl": {
          "type": "integer"
        },
        "status": {
          "type": "string"
        }
      },
      "required": ["service_name","service_version","endpoint_url", "ttl", "status"]
    },
    "CatalogDeregisterModel": {
      "type": "object",
      "properties": {
        "service_name": {
          "type": "string"
        },
        "service_version": {
          "type": "string"
        }
      },
      "required": ["service_name","service_version"]
    }
  }
}

The new methods will be POST-based, so we need to define models (CatalogRegisterModel and CatalogDeregisterModel) for the data passed through the method body. After API Gateway processes the models, the JSON objects will be passed, as is, to the Lambda functions.
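
Because the register and deregister integrations use the $input.json('$') request template, each Lambda function receives the request body unchanged as its event. As a rough illustration of what that looks like on the Lambda side (the real handlers are packaged from catalog_register.py and catalog_deregister.py in the deployment step below; this stub only shows the shape of the event):


def lambda_handler(event, context):
    # The event is the CatalogRegisterModel document, passed through as is
    service_name = event['service_name']
    service_version = event['service_version']
    endpoint_url = event['endpoint_url']
    # ttl and status are also available on the event for persistence
    return {}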

We set the x-amazon-apigateway-auth element to the aws_iam type for the register and deregister methods, so API Gateway will require a Signature Version 4 signature when we access them.

We can now deploy our new functions:


ACCOUNT_NUMBER = _your account number_

create_deployment_package("/tmp/catalog_register.zip", ["catalog_register.py"])
catalog_register_arn = create_lambda_function(
                       "/tmp/catalog_register.zip",
                       "catalog_register",
                       "arn:aws:iam::"+ACCOUNT_NUMBER+":role/lambda_s3",
                       "catalog_register.lambda_handler",
                       "Registering a service.",
                       ACCOUNT_NUMBER)
replace_instances_in_file("swagger.json",
                          "/tmp/swagger_with_arn.json",
                          "$catalog_registerARN$", catalog_register_arn)
create_deployment_package("/tmp/catalog_deregister.zip",
                          ["catalog_deregister.py"])
catalog_deregister_arn = create_lambda_function(
                       "/tmp/catalog_deregister.zip",
                       "catalog_deregister",
                       "arn:aws:iam::"+ACCOUNT_NUMBER+":role/lambda_s3",
                       "catalog_deregister.lambda_handler",
                       "Deregistering a service.",
                       ACCOUNT_NUMBER)
replace_instances_in_file("/tmp/swagger_with_arn.json",
                          "/tmp/swagger_with_arn.json",
                          "$catalog_deregisterARN$", catalog_deregister_arn)
catalog_service_arn = get_function_arn("catalog_service")
replace_instances_in_file("/tmp/swagger_with_arn.json",
                          "/tmp/swagger_with_arn.json",
                          "$catalog_serviceARN$", catalog_service_arn)
api_id = update_api("/tmp/swagger_with_arn.json")
deploy_api(api_id, "/tmp/swagger_with_arn.json", "dev")

We can try out the new register service like this:


json_body = {
            "service_name": "registerservice3",
            "service_version": "1.0",
            "endpoint_url": "notarealurlregister3",
            "ttl": "300",
            "status": "healthy"
            }
request_url = "https://yourrestapi.execute-api.us-east-1.amazonaws.com/"\
              "dev/catalog/register"
response = requests.post(
            request_url,
            data=json.dumps(json_body))
if(not response.ok):
    logger.error("Error code: %i" % (response.status_code,))


We should get something like this:


ERROR:root:Error code: 403

Signing a Request with Signature Version 4

To successfully call our new services, we need to implement a client that signs the request to the API with a Signature Version 4 signature. First, we implement the functions that create the signature:


from botocore.credentials import get_credentials
from botocore.session import get_session
import requests
import json
import logging
import sys
import datetime
import hashlib
import hmac
import urlparse
import urllib
from collections import OrderedDict

def sign(key, msg):
    """Sign string with key."""
    return hmac.new(key, msg.encode('utf-8'), hashlib.sha256).digest()


def getSignatureKey(key, dateStamp, regionName, serviceName):
    """Create signature key."""
    kDate = sign(('AWS4' + key).encode('utf-8'), dateStamp)
    kRegion = sign(kDate, regionName)
    kService = sign(kRegion, serviceName)
    kSigning = sign(kService, 'aws4_request')
    return kSigning


def create_canonical_querystring(params):
    """Create canonical query string."""
    ordered_params = OrderedDict(sorted(params.items(), key=lambda t: t[0]))
    canonical_querystring = ""
    for key, value in ordered_params.iteritems():
        if len(canonical_querystring) > 0:
            # Canonical query string pairs are joined with '&'
            canonical_querystring += "&"
        # urlparse.parse_qs returns a list of values for each parameter;
        # take the first value and URI-encode it, as the canonical form requires
        canonical_querystring += key + "=" + urllib.quote(str(value[0]), safe='')
    return canonical_querystring


def sign_request(method, url, credentials, region, service, body=''):
    """Sign a HTTP request with AWS V4 signature."""
    ###############################
    # 1. Create a Canonical Request
    ###############################
    t = datetime.datetime.utcnow()
    amzdate = t.strftime('%Y%m%dT%H%M%SZ')
    # Date w/o time, used in credential scope
    datestamp = t.strftime('%Y%m%d')

    # Create the different parts of the request, with content sorted
    # in the prescribed order
    parsed_url = urlparse.urlparse(url)
    canonical_uri = parsed_url.path
    canonical_querystring = create_canonical_querystring(
                              urlparse.parse_qs(parsed_url.query))
    canonical_headers = ("host:%s\n"
                         "x-amz-date:%s\n" %
                         (parsed_url.hostname, amzdate))
    signed_headers = 'host;x-amz-date'
    if (not (credentials.token is None)):
        canonical_headers += ("x-amz-security-token:%s\n") % (credentials.token,)
        signed_headers += ';x-amz-security-token'

    payload_hash = hashlib.sha256(body).hexdigest()
    canonical_request = ("%s\n%s\n%s\n%s\n%s\n%s" %
                         (method,
                          urllib.quote(canonical_uri),
                          canonical_querystring,
                          canonical_headers,
                          signed_headers,
                          payload_hash))

    #####################################
    # 2. Create a String to Sign
    #####################################
    algorithm = 'AWS4-HMAC-SHA256'
    credential_scope = ("%s/%s/%s/aws4_request" % 
                        (datestamp,
                         region,
                         service))
    string_to_sign = ("%s\n%s\n%s\n%s" %
                       (algorithm,
                        amzdate,
                        credential_scope,
                        hashlib.sha256(canonical_request).hexdigest()))
    #####################################
    # 3. Create a Signature
    #####################################
    signing_key = getSignatureKey(credentials.secret_key,
                                  datestamp, region, service)
    signature = hmac.new(signing_key, (string_to_sign).encode('utf-8'),
                         hashlib.sha256).hexdigest()

    ######################################################
    # 4. Assemble the request so it can be used for submission
    ######################################################
    authorization_header = ("%s Credential=%s/%s, "
                            "SignedHeaders=%s, "
                            "Signature=%s" %
                            (algorithm,
                             credentials.access_key,
                             credential_scope,
                             signed_headers,
                             signature))
    headers = {'x-amz-date': amzdate, 'Authorization': authorization_header}
    if (not (credentials.token is None)):
        headers['x-amz-security-token'] = credentials.token
    request_url = ("%s://%s%s" % 
                   (parsed_url.scheme,parsed_url.netloc,canonical_uri))
    if (len(canonical_querystring) > 0):
        request_url += ("?%s" % (canonical_querystring,))

    return request_url, headers, body

The main function, sign_request, can sign requests for both POST and GET methods. It also works with both short-term and long-term credentials. For more information about creating Signature Version 4 requests, see Signing Requests.

We implement the following method to submit a POST request:


def signed_post(url, region, service, data, **kwargs):
    """Signed post with AWS V4 Signature."""
    credentials = get_credentials(get_session())

    request_url, headers, body = sign_request("POST", url, credentials, region,
                                              service, body=data)

    return requests.post(request_url, headers=headers, data=body, **kwargs)

We use botocore to resolve the credentials configured on the machine we are running on. If we run this on an Amazon EC2 instance or in AWS Lambda, botocore will pick up the credentials of the configured IAM role.
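
Because sign_request is method-agnostic, a GET counterpart can be built the same way. Here is a minimal sketch (signed_get is not part of the service code above, and the lookup method used below does not require a signature, so this is purely illustrative):


def signed_get(url, region, service, **kwargs):
    """Signed GET with AWS V4 Signature (illustrative sketch)."""
    credentials = get_credentials(get_session())
    # GET requests carry no body, so an empty payload is hashed into the signature
    request_url, headers, _ = sign_request("GET", url, credentials,
                                           region, service)
    return requests.get(request_url, headers=headers, **kwargs)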

We can now test the service by calling register:


json_body = {
            "service_name": "registerservice6",
            "service_version": "1.0",
            "endpoint_url": "notarealurlregister6",
            "ttl": "300",
            "status": "healthy"
            }
request_url = "https://yourrestapiid.execute-api.us-east-1.amazonaws.com/"\
              "dev/catalog/register"
response = signed_post(
            request_url,
            "us-east-1",
            "execute-api",
            json.dumps(json_body))
if(not response.ok):
    logger.error("Error code: %i" % (response.status_code,))
else:
    logger.info("Successfully registered the service.")

The test should complete without a failure. To verify, look up the item we just registered:


request_url="https://your_rest_api_id.execute-api.us-east-1.amazonaws.com/"\
            "dev/v1/catalog/registerservice6/1.0"
response = requests.get(request_url)
json_response = json.loads(response.content)
logging.info("Endpoint URL: %s" % (json_response['endpoint_url'],))
logging.info("TTL: %i" % (json_response['ttl'],))
logging.info("Status: %s" % (json_response['status'],))

You should get the following output:


INFO:root:Endpoint URL: notarealurlregister6
INFO:root:TTL: 300
INFO:root:Status: healthy
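
The deregister method is protected by the same aws_iam setting, so it can be exercised with the signed_post helper as well. A minimal sketch, reusing the placeholders above:


json_body = {
            "service_name": "registerservice6",
            "service_version": "1.0"
            }
request_url = "https://yourrestapiid.execute-api.us-east-1.amazonaws.com/"\
              "dev/catalog/deregister"
response = signed_post(
            request_url,
            "us-east-1",
            "execute-api",
            json.dumps(json_body))
if not response.ok:
    logger.error("Error code: %i" % (response.status_code,))
else:
    logger.info("Successfully deregistered the service.")

A successful call returns a 201 status code, matching the "service deregistered" response defined in the Swagger file.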

Using a Thread Pool with the AWS SDK for C++

by Jonathan Henson | on | in C++ | Permalink | Comments |  Share

The default thread executor implementation we provide for asynchronous operations spins up a thread and then detaches it. On modern operating systems, this is often exactly what we want. However, there are other use cases for which this simply will not work. For example, suppose we want to fire off asynchronous calls to Amazon Kinesis as quickly as we receive events, and suppose we sometimes receive these events at a rate of 10 per millisecond. Even if we are calling Amazon Kinesis from an Amazon Elastic Compute Cloud (EC2) instance in the same data center as our Amazon Kinesis stream, the latency will eventually cause the number of threads on our system to bloat and possibly exhaust the operating system's limits.

Here is an example of what this code might look like:


#include <aws/kinesis/model/PutRecordsRequest.h>
#include <aws/kinesis/KinesisClient.h>
#include <aws/core/utils/Outcome.h>
#include <aws/core/utils/memory/AWSMemory.h>
#include <aws/core/Aws.h>

using namespace Aws::Client;
using namespace Aws::Utils;
using namespace Aws::Kinesis;
using namespace Aws::Kinesis::Model;

class KinesisProducer
{
public:
    KinesisProducer(const Aws::String& streamName, const Aws::String& partition) : m_partition(partition), m_streamName(streamName)
    {
        ClientConfiguration clientConfiguration;
        m_client = Aws::New<KinesisClient>("kinesis-sample", clientConfiguration);
    }

    ~KinesisProducer()
    {
        Aws::Delete(m_client);
    }

    void StreamData(const Aws::Vector<ByteBuffer>& data)
    {
        PutRecordsRequest putRecordsRequest;
        putRecordsRequest.SetStreamName(m_streamName);

        for(auto& datum : data)
        {
            PutRecordsRequestEntry putRecordsRequestEntry;
            putRecordsRequestEntry.WithData(datum)
                    .WithPartitionKey(m_partition);

            putRecordsRequest.AddRecords(putRecordsRequestEntry);
        }

        m_client->PutRecordsAsync(putRecordsRequest,
               std::bind(&KinesisProducer::OnPutRecordsAsyncOutcomeReceived, this, std::placeholders::_1, std::placeholders::_2, std::placeholders::_3, std::placeholders::_4));
    }

private:
    void OnPutRecordsAsyncOutcomeReceived(const KinesisClient*, const Model::PutRecordsRequest&,
                                          const Model::PutRecordsOutcome& outcome, const std::shared_ptr<const Aws::Client::AsyncCallerContext>&)
    {
        if(outcome.IsSuccess())
        {
            std::cout << "Records Put Successfully " << std::endl;
        }
        else
        {
            std::cout << "Put Records Failed with error " << outcome.GetError().GetMessage() << std::endl;
        }
    }

    KinesisClient* m_client;
    Aws::String m_partition;
    Aws::String m_streamName;
};


int main()
{
    Aws::SDKOptions options;
    Aws::InitAPI(options);
	{
		KinesisProducer producer("kinesis-sample", "announcements");

		while(true)
		{
			Aws::String event1("Event #1");
			Aws::String event2("Event #2");

			producer.StreamData( {
										 ByteBuffer((unsigned char*)event1.c_str(), event1.length()),
										 ByteBuffer((unsigned char*)event2.c_str(), event2.length())
								 });
		}
	}
    Aws::ShutdownAPI(options);
    return 0;
}


This example is intended to show how exhausting the available threads from the operating system will ultimately result in a program crash. Most systems with this problem would be bursty and would not create such a sustained load. Still, we need a better way to handle our threads for such a scenario.

This week, we released a thread pool executor implementation. Simply include the aws/core/utils/threading/Executor.h file. The class name is PooledThreadExecutor. You can set two options: the number of threads for the pool to use and the overflow policy.

Currently, there are two overflow policy modes:

QUEUE_TASKS_EVENLY_ACROSS_THREADS allows you to push as many tasks as you want to the executor; it makes sure tasks are queued and pulled by each thread as quickly as possible. For most cases, QUEUE_TASKS_EVENLY_ACROSS_THREADS is the preferred option.

REJECT_IMMEDIATELY rejects a task submission if the number of queued tasks ever exceeds the size of the thread pool.

Let’s revise our example to use a thread pool:


#include <aws/kinesis/model/PutRecordsRequest.h>
#include <aws/kinesis/KinesisClient.h>
#include <aws/core/utils/Outcome.h>
#include <aws/core/utils/memory/AWSMemory.h>
#include <aws/core/utils/threading/Executor.h>

using namespace Aws::Client;
using namespace Aws::Utils::Threading;
using namespace Aws::Kinesis;
using namespace Aws::Kinesis::Model;

class KinesisProducer
{
public:
    KinesisProducer(const Aws::String& streamName, const Aws::String& partition) : m_partition(partition), m_streamName(streamName)
    {
        ClientConfiguration clientConfiguration;
        clientConfiguration.executor = Aws::MakeShared<PooledThreadExecutor>("kinesis-sample", 10);
        m_client = Aws::New<KinesisClient>("kinesis-sample", clientConfiguration);
    }

    ....

The only change we need to make to add the thread pool to our configuration is to assign an instance of the new executor implementation to our ClientConfiguration object.

As always, we welcome your feedback, and even pull requests, about how we can improve this feature.