File Interface to AWS Storage Gateway
I should probably have a blog category for “catching up from AWS re:Invent!” Last November we made a really important addition to the AWS Storage Gateway that I was too busy to research and write about at the time.
As a reminder, the Storage Gateway is a multi-protocol storage appliance that fits in between your existing applications and the AWS Cloud. Your applications and your client operating systems see the gateway as (depending on the configuration), a file server, a local disk volume, or a virtual tape library (VTL). Behind the scenes, the gateway uses Amazon Simple Storage Service (S3) for cost-effective, durable, and secure storage. Storage Gateway caches data locally and uses bandwidth management to optimize data transfers.
Storage Gateway is delivered as a self-contained virtual appliance that is easy to install, configure, and run (read the Storage Gateway User Guide to learn more). It allows you to take advantage of the scale, durability, and cost benefits of cloud storage from your existing environment. It reduces the process of moving existing files and directories into S3 to a simple drag and drop (or a CLI-based copy).
As is the case with many AWS services, the Storage Gateway has gained many features since we first launched it in 2012 (The AWS Storage Gateway – Integrate Your Existing On-Premises Applications with AWS Cloud Storage). At launch, the Storage Gateway allowed you to create storage volumes and to attach them as iSCSI devices, with options to store either the entire volume or a cache of the most frequently accessed data in the gateway, all backed by S3. Later, we added support for Virtual Tape Libraries (Create a Virtual Tape Library Using the AWS Storage Gateway). Earlier this year we added read-only file shares, user permission squashing, and scanning for added and removed objects.
New File Interface
At AWS re:Invent we launched a third option, and that’s what I’d like to tell you about today. You can now use the Storage Gateway as a virtual file server that you can mount on your on-premises servers and desktops. After you set it up in your data center or in the cloud, your configured buckets will be available as NFS mount points. Your application simply reads and writes files and directories over NFS; behind the scenes, the gateway turns these operations into object-level requests on your S3 buckets, where they are accessible natively (one S3 object per file). To create a file gateway, you simply visit the Storage Gateway Console, click on Get started, and choose File gateway:

Then choose your host platform: VMware ESXi or Amazon EC2:

I expect many of our customers to host the Storage Gateway on premises and to use it as a permanent or temporary bridge to the cloud. Use cases for this option include simplified backups, migration, archiving, analytics, storage tiering, and compute-intensive cloud-based processing. Once the data is in the cloud, you can take advantage of many features of S3 including multiple storage tiers (Infrequent Access and Glacier are great for archiving), storage analytics, tagging, and the like.
I don’t have much data on-premises so I’m going to run the Storage Gateway on an EC2 instance for this post. I launched the instance and set it up per the instructions on the screen, taking care to create the proper inbound security group rules (port 80 for HTTP access and port 2049 for NFS). I added 150 GiB of General Purpose SSD storage to be used as a cache:

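If you prefer to do this part from the command line, attaching the cache disk comes down to creating a gp2 volume and attaching it to the gateway instance. Here's a rough sketch (the Availability Zone, volume ID, instance ID, and device name are placeholders):
$ aws ec2 create-volume --size 150 --volume-type gp2 --availability-zone us-east-1a
$ aws ec2 attach-volume --volume-id vol-0123456789abcdef0 --instance-id i-0123456789abcdef0 --device /dev/sdf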
After the instance launched I captured its public IP address and used it to connect to my newly launched gateway:

I set the time zone and assigned a name to my gateway and clicked on Activate gateway:

Then I configured the local storage as a cache, and clicked on Save and continue:

My gateway was up and running, and I could see it in the console:

Next, I clicked on Create file share to create an NFS share and associate it with an S3 bucket:

As you can see, I had the opportunity to choose my storage class (Standard or Standard – Infrequent Access, in accord with my needs and my use case). The gateway needs to be able to upload files into my bucket; clicking on Create a new IAM role will create a role and a policy (read Granting Access to an Amazon S3 Destination to learn more).
I reviewed my settings and clicked on Create file share:

By the way, Root squash is a feature of the AWS Storage Gateway, not a vegetable. When enabled (as it is by default) files that arrive as owned by root (user id 0) are mapped to user id 65534 (traditionally known as nobody). I can also set up default permissions for new files and new directories.
My new share is visible in the console, and available for use within seconds:

The console displays the appropriate mount commands for Linux, Microsoft Windows, and macOS. Those commands use the private IP address of the instance; in many cases you will want to use the public address instead (needless to say, you should exercise extreme care when you create a public NFS share, and maintain close control over the IP addresses that are allowed to connect).
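For reference, the Linux mount command follows the usual NFS pattern. Here's a sketch using my bucket name; the gateway IP and local mount path are placeholders, and the exact options shown in the console may differ:
$ sudo mount -t nfs -o nolock 198.51.100.10:/jbarr-gw-1 /mnt/gateway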
I flipped over to the S3 console and inspected the bucket (jbarr-gw-1), finding it empty, as expected:

Then I turned to my EC2 instance, mounted the share, and copied some files to it:

I returned to the console and found a new folder (jeff_code) in my bucket, as expected. I ventured inside and found the files that I had copied to the share:

As you can see, my files are copied directly into S3 and are simply regular S3 objects. This means that I can use my existing S3 tools, code, and analytics to process them. For example:
- Analytics – The new S3 metrics and analytics can be used to analyze the entire bucket or any directory tree within it:

- Code – AWS Lambda and Amazon Rekognition can be used to process uploaded images; see Serverless Photo Recognition for some ideas and some code. I could also use Amazon Elasticsearch Service to index some or all of the files or Amazon EMR to process massive amounts of data.
- Tools – I can process the existing objects in the bucket and I can also create new ones using the S3 APIs. Any code or script that creates or removes objects should call the RefreshCache function to synchronize the contents of any gateways attached to the bucket (I can create a multi-site data distribution workflow by pointing multiple read-only gateways at the same bucket). I can also make use of existing, file-centric backup tools by using the share as the destination for my backups.
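Here's what that call looks like from the AWS Command Line Interface (CLI); this is a sketch, and the file share ARN is a placeholder:
$ aws storagegateway refresh-cache --file-share-arn arn:aws:storagegateway:us-east-1:123456789012:share/share-ABCD1234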
The gateway stores all of the file metadata (owner, group, permissions, and so forth) as S3 metadata:

Storage Gateway Resources
Here are some resources that will help you to learn more about the Storage Gateway:
Presentation – Deep Dive on the AWS Storage Gateway:
White Paper – File Gateway for Hybrid Architectures – Overview and Best Practices:
Recent Videos:
- Deep Dive on the AWS Storage Gateway – AWS Online Tech Talk.
- Introducing the New AWS Storage Gateway – re:Invent 2016.
- Using the AWS Storage Gateway Virtual Tape Library with Veritas Backup Exec.
- Getting Started with the Hybrid Cloud: Enterprise Backup and Recovery – re:Invent 2016.
Available Now
This cool AWS feature has been available since last November!
— Jeff;
New – AWS OpsWorks for Chef Automate
AWS OpsWorks helps you to configure and run applications using Chef. You use a Domain Specific Language (DSL) to write cookbooks that define your application’s architecture and the configuration of each component. The Chef server is an essential part of the configuration process. It stores all of the cookbooks and tracks state information for each of the instances (nodes in Chef terminology).
Because the Chef server is in the critical path when newly launched instances are configured, it must be reliable. Many OpsWorks and Chef users install and maintain this important architectural component themselves. In production-scale environments, this leaves them to handle backups, restores, version upgrades, and so forth.
New AWS OpsWorks for Chef Automate
Early this month we launched AWS OpsWorks for Chef Automate from the AWS re:Invent stage. You can launch the Chef Automate server with just 3 clicks and start using it within minutes. You can use community cookbooks from Chef Supermarket and community tools such as Test Kitchen and Knife.
You can use Chef Automate to manage your infrastructure throughout your application's lifecycle. For example, newly launched EC2 instances can automatically connect to the Chef server and run a specified recipe by using an unattended association script (read Adding Nodes Automatically in AWS OpsWorks for Chef Automate to learn more). The registration script can be used to register EC2 instances created dynamically through an Auto Scaling Group and to register on-premises servers.
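For example, here's a sketch of registering a node from the CLI; I'm assuming the opsworks-cm namespace and these engine attribute names, so treat the documentation linked above as authoritative:
$ aws opsworks-cm associate-node --server-name MyChefServer --node-name my-node \
    --engine-attributes Name=CHEF_ORGANIZATION,Value=default Name=CHEF_NODE_PUBLIC_KEY,Value="<node public key>"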
Take a Look
Let’s launch a Chef Automate server from the OpsWorks Console. Click on Go to OpsWorks for Chef Automate to get started.

Click on Create Chef Automate server, give your server a name, choose a region, and select a suitable EC2 instance type:

Choose one of your SSH key pairs, or opt out of SSH:

Finally, configure your network (VPC), IAM, maintenance window, and backup settings:

Click on Next, review your settings, and then click on Launch! The launch process takes less than 20 minutes. During that time you can download the sign-in credentials for your Chef Automate dashboard along with a Starter Kit:

You can see all of your Chef Automate servers at a glance:

Click on the server name (BorkBorkBork here), and then on Open Chef Automate dashboard, then enter your credentials to log in:

And here’s the dashboard:

You can see and manage your nodes:

Manage your workflows:

And much more!
Behind the scenes, the launch process uses an AWS CloudFormation template. The template creates an EC2 instance, an Elastic IP Address, and a Security Group.
Available Now
You can launch AWS OpsWorks for Chef Automate today in the US East (Northern Virginia), US West (Oregon), and EU (Ireland) Regions. Pricing is based on the number of nodes and the number of hours that they are connected to the server; see the Chef Automate Pricing page for more info. As part of the AWS Free Tier, you can use up to 10 nodes at no charge for 12 months.
— Jeff;
Amazon EFS Update – On-Premises Access via Direct Connect
I introduced you to Amazon Elastic File System last year (Amazon Elastic File System – Shared File Storage for Amazon EC2) and announced production readiness earlier this year (Amazon Elastic File System – Production-Ready in Three Regions). Since then, thousands of AWS customers have used it to set up, scale, and operate shared file storage in the cloud.
Today we are making EFS even more useful with the introduction of simple and reliable on-premises access via AWS Direct Connect. This has been a much-requested feature and I know that it will be useful for migration, cloudbursting, and backup. To use this feature for migration, you simply attach an EFS file system to your on-premises servers, copy your data to it, and then process it in the cloud as desired, leaving your data in AWS for the long term. For cloudbursting, you would copy on-premises data to an EFS file system, analyze it at high speed using a fleet of Amazon Elastic Compute Cloud (EC2) instances, and then copy the results back on-premises or visualize them in Amazon QuickSight.
You’ll get the same file system access semantics including strong consistency and file locking, whether you access your EFS file systems from your on-premises servers or from your EC2 instances (of course, you can do both concurrently). You will also be able to enjoy the same multi-AZ availability and durability that is part-and-parcel of EFS.
In order to take advantage of this new feature, you will need to use Direct Connect to set up a dedicated network connection between your on-premises data center and an Amazon Virtual Private Cloud. Then you need to make sure that your filesystems have mount targets in subnets that are reachable via the Direct Connect connection:

You also need to add a rule to the mount target’s security group in order to allow inbound TCP and UDP traffic to port 2049 (NFS) from your on-premises servers:

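If you manage your security groups from the CLI, the rules look something like this sketch (the security group ID and on-premises CIDR are placeholders):
$ aws ec2 authorize-security-group-ingress --group-id sg-0123456789abcdef0 --protocol tcp --port 2049 --cidr 10.10.0.0/16
$ aws ec2 authorize-security-group-ingress --group-id sg-0123456789abcdef0 --protocol udp --port 2049 --cidr 10.10.0.0/16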
After you create the file system, you can reference the mount targets by their IP addresses, NFS-mount them on-premises, and start copying files. The IP addresses are available from within the AWS Management Console:

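The mount itself is a standard NFSv4.1 mount; here's a sketch using a placeholder mount target IP (the recommended mount options may vary, so check the console instructions for the current set):
$ sudo mount -t nfs -o nfsvers=4.1,rsize=1048576,wsize=1048576,hard,timeo=600,retrans=2 10.0.1.32:/ /mnt/efs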
The Management Console also provides you with access to step-by-step directions! Simply click on the On-premises mount instructions:

And follow along:

This feature is available today at no extra charge in the US East (Northern Virginia), US West (Oregon), EU (Ireland), and US East (Ohio) Regions.
— Jeff;
Amazon AppStream 2.0 – Stream Desktop Apps from AWS
My colleague Gene Farrell wrote the guest post below to tell you how the original vision for Amazon AppStream evolved in the face of customer feedback.
— Jeff;
At AWS, helping our customers solve problems and serve their customers with technology is our mission. It drives our thinking, and it’s at the center of how we innovate. Our customers use services from AWS to build next-generation mobile apps, create delightful web experiences, and even run their core IT workloads, all at global scale.
While we have seen tremendous innovation and transformation in mobile, web, and core IT, relatively little has changed with desktops and desktop applications. End users don’t yet enjoy freedom in where and how they work; IT is stuck with rigid and expensive systems to manage desktops, applications, and a myriad of devices; and securing company information is harder than ever. In many ways, the cloud seems to have bypassed this aspect of IT.
Our customers want to change that. They want the same benefits of flexibility, scale, security, performance, and cost for desktops and applications as they’re seeing with mobile, web, and core IT. A little over two years ago, we introduced Amazon WorkSpaces, a fully managed, secure cloud desktop service that provides a persistent desktop running on AWS. Today, I am excited to introduce you to Amazon AppStream 2.0, a fully managed, secure application streaming service for delivering your desktop apps to web browsers.
Customers have told us that they have many traditional desktop applications that need to work on multiple platforms. Maintaining these applications is complicated and expensive, and customers are looking for a better solution. With AppStream 2.0, you can provide instant access to desktop applications using a web browser on any device, by streaming them from AWS. You don’t need to rewrite your applications for the cloud, and you only need to maintain a single version. Your applications and data remain secure on AWS, and the application stream is encrypted end to end.
Looking back at the original AppStream
Before I get into more details about AppStream 2.0, it’s worth looking at the history of the original Amazon AppStream service. We launched AppStream in 2013 as an SDK-based service that customers could use to build streaming experiences for their desktop apps, and move these apps to the cloud. We believed that the SDK approach would enable customers to integrate application streaming into their products. We thought game developers and graphics ISVs would embrace this development model, but it turns out it was more work than we anticipated, and required significant engineering investment to get started. Those who did try it, found that the feature set did not meet their needs. For example, AppStream only offered a single instance type based on the g2.2xlarge EC2 instance. This limited the service to high-end applications where performance would justify the cost. However, the economics didn’t make sense for a large number of applications.
With AppStream, we set out to solve a significant customer problem, but failed to get the solution right. This is a risk that we are willing to take at Amazon. We want to move quickly, explore areas where we can help customers, but be prepared for failure. When we fail, we learn and iterate fast. In this case, we continued to hear from customers that they needed a better solution for desktop applications, so we went back to the drawing board. The result is AppStream 2.0.
Benefits of AppStream 2.0
AppStream 2.0 addresses many of the concerns we heard from customers who tried the original AppStream service. Here are a few of the benefits:
- Run desktop applications securely on any device in an HTML5 web browser on Windows and Linux PCs, Macs, and Chromebooks.
- Instant-on access to desktop applications from wherever users are. There are no delays, no large files to download, and no time-consuming installations. Users get a responsive, fluid experience that is just like running natively installed apps.
- Simple end user interface so users can run in full screen mode, open multiple applications within a browser tab, and easily switch between and interact with them. You can upload files to a session, access and edit them, and download them when you’re done. You can also print, listen to audio, and adjust bandwidth to optimize for your network conditions.
- Secure applications and data that remain on AWS – only encrypted pixels are streamed to end users. Application streams and user input flow through a secure streaming gateway on AWS over HTTPS, making them firewall friendly. Applications can run inside your own virtual private cloud (VPC), and you can use Amazon VPC security features to control access. AppStream 2.0 supports identity federation, which allows your users to access their applications using their corporate credentials.
- Fully managed service, so you don’t need to plan, deploy, manage, or upgrade any application streaming infrastructure. AppStream 2.0 manages the AWS resources required to host and run your applications, scales automatically, and provides access to your end users on demand.
- Consistent, scalable performance on AWS, with access to compute capabilities not typically available on local devices. You can instantly scale locally and globally, and ensure that your users always get a low-latency experience.
- Multiple streaming instance types to run your applications. You can use instance types from the General Purpose, Compute Optimized, and Memory Optimized instance families to optimize application performance and reduce your overall costs.
- NICE DCV for high-performance streaming provides secure, high-performance access to applications. NICE DCV delivers a fluid interactive experience, and automatically adjusts to network conditions.
Pricing & availability
With AppStream 2.0, you pay only for the streaming instances that you use, and a small monthly fee per authorized user. The charge for streaming instances depends on the instance type that you select, and the maximum number of concurrent users that will access their applications.
A user fee is charged per unique authorized user accessing applications in a region in any given month. The user fee covers the Microsoft RDS SAL license, and may be waived if you bring your own RDS CAL licenses via Microsoft’s license mobility program. AppStream 2.0 offers a Free Tier, which provides an admin experience for getting started. The Free Tier includes 40 hours per month, for up to two months. For more information, see this page.
AppStream 2.0 is available today in the US East (Northern Virginia), US West (Oregon), EU (Ireland), and Asia Pacific (Tokyo) Regions. You can try the AppStream 2.0 end user experience for free today, with no setup required, by accessing sample applications already installed on AppStream 2.0. To access the Try It Now experience, log in with your AWS account and choose an app to get started.
To learn more about AppStream 2.0, visit the AppStream page.
— Gene Farrell, Vice President, AWS Enterprise Applications & EC2 Windows
Update: Sign up for the January 20th webinar now to learn more! Register here.
New – IPv6 Support for EC2 Instances in Virtual Private Clouds
The continued growth of the Internet, particularly in the areas of mobile applications, connected devices, and IoT, has spurred an industry-wide move to IPv6. In accord with a mandate that dates back to 2010, United States government agencies have been working to move their public-facing servers and services to IPv6 as quickly as possible. With 128 bits of address space, IPv6 has plenty of room for growth and also opens the door to new applications and new use cases.
IPv6 for EC2
Earlier this year we launched IPv6 support for S3 (including Transfer Acceleration), CloudFront, WAF, and Route 53. Today we are taking the next big step forward with the launch of IPv6 support for Virtual Private Cloud (VPC) and EC2 instances running in a VPC. This support is launching today in the US East (Ohio) Region and is in the works for the others.
IPv6 support works for new and existing VPCs; you can opt in on a VPC-by-VPC basis by simply checking a box on the Console (API and CLI support is also available):

Each VPC is given a unique /56 address prefix from within Amazon’s GUA (Global Unicast Address) range; you can assign a /64 address prefix to each subnet in your VPC:

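You can do the same thing from the command line; here's a sketch with placeholder VPC, subnet, and prefix values (the /64 you assign to a subnet must come from the /56 allocated to your VPC):
$ aws ec2 associate-vpc-cidr-block --vpc-id vpc-0123456789abcdef0 --amazon-provided-ipv6-cidr-block
$ aws ec2 associate-subnet-cidr-block --subnet-id subnet-0123456789abcdef0 --ipv6-cidr-block 2001:db8:1234:1a00::/64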
As we did with S3, we make use of a dual-stack model that assigns each instance an IPv4 address and an IPv6 address, along with corresponding DNS entries. Support for both versions of the protocol ensures compatibility and flexibility to access resources and applications.
Security Groups, Route Tables, Network ACLs, VPC Peering, Internet Gateway, Direct Connect, VPC Flow Logs, and DNS resolution within a VPC all operate in the same way as today. Application Load Balancer support for the dual-stack model is on the near-term roadmap and I’ll let you know as soon as it is available.
IPv6 Support for Direct Connect
The Direct Connect Console lets you create virtual interfaces (VIFs) with your choice of IPv4 or IPv6 addresses:

Each VIF supports one BGP peering session over IPv4 and one BGP peering session over IPv6.
New Egress-Only Internet Gateway for IPv6
One of the interesting things about IPv6 is that every address is internet-routable and can talk to the Internet by default. In an IPv4-only VPC, assigning a public IP address to an EC2 instance sets up 1:1 NAT (Network Address Translation) to a private address that is associated with the instance. In a VPC where IPv6 is enabled, the address associated with the instance is public. This direct association removes a host of networking challenges, but it also means that you need another mechanism to create private subnets.
As part of today’s launch, we are introducing a new Egress-Only Internet Gateway (EGW) that you can use to implement private subnets for your VPCs. The EGW is easier to set up and to use than a fleet of NAT instances, and is available to you at no cost. It allows you to block incoming traffic while still allowing outbound traffic (think of it as an Internet Gateway mated to a Security Group). You can create an EGW in all of the usual ways, and use it to impose restrictions on inbound IPv6 traffic. You can continue to use NAT instances or NAT Gateways for IPv4 traffic.
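Here's a sketch of setting one up from the CLI and routing outbound IPv6 traffic through it (the VPC, route table, and gateway IDs are placeholders):
$ aws ec2 create-egress-only-internet-gateway --vpc-id vpc-0123456789abcdef0
$ aws ec2 create-route --route-table-id rtb-0123456789abcdef0 --destination-ipv6-cidr-block ::/0 --egress-only-internet-gateway-id eigw-0123456789abcdef0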
Available Now
IPv6 support for EC2 is now available in the US East (Ohio) Region and you can start using it today at no extra charge. It works with all current-generation EC2 instance types with the exception of M3 and G2, and will be supported on upcoming instance types as well.
IPv6 support for other AWS Regions is in the works and I’ll let you know (most likely via a tweet), just as soon as it is ready!
— Jeff;
New – AWS Step Functions – Build Distributed Applications Using Visual Workflows
We want to make it even easier for you to build complex, distributed applications by connecting multiple web and microservices. Whether you are implementing a complex business process or setting up a processing pipeline for photo uploads, we want you to focus on the code instead of on the coordination. We want you to be able to build reliable applications that are robust, scalable, and cost-effective, while you use the tools and libraries that you are already familiar with.
How does that sound?
Introducing AWS Step Functions
Today we are launching AWS Step Functions to allow you to do exactly what I described above. You can coordinate the components of your application as a series of steps in a visual workflow. You create state machines in the Step Functions Console to specify and execute the steps of your application at scale.
Each state machine defines a set of states and the transitions between them. States can be activated sequentially or in parallel; Step Functions will make sure that all parallel states run to completion before moving forward. States perform work, make decisions, and control progress through the state machine.
Here’s a state machine that includes a little bit of everything:

Multiple copies of each state machine can be running independently at the same time; each copy is called an execution. Step Functions will let you run thousands of executions concurrently, so you can scale to any desired level.
There are two different ways to specify what you want to happen when a state is run. First, you can supply a Lambda function that will be synchronously invoked when the state runs. Second, you can supply the name of an Activity. This is a reference to a long-running worker function that polls (via the API) for work to be done. Either way, the code is supplied with a JSON statement as input, and is expected to return another JSON statement as output.
As part of your state machine, you can specify error handling behavior and retry logic. This allows you to build robust multi-step apps that will run smoothly even if transient issues in one part of your code cause a momentary failure.
Quick Tour
Let’s set up a state machine through the AWS Management Console. Keep in mind that production applications will most likely use the AWS Step Functions API (described below) to create and run state machines.
I start by creating and saving a simple Lambda function:

While I am there I also capture the function’s ARN:

Then I go over to the AWS Step Functions Console and click on Create a State Machine. I enter a name (MyStateMachine), and I can click on one of the blueprints to get a running start:

I start with Hello World and use elements of Parallel to create this JSON model of my state machine (read the Amazon States Language spec to learn more):
{
  "Comment": "A simple example of the Steps language using an AWS Lambda Function",
  "StartAt": "Hello",
  "States": {
    "Hello": {
      "Type": "Task",
      "Resource": "arn:aws:lambda:eu-west-1:99999999999:function:HelloWord_Step",
      "Next": "Parallel"
    },
    "Parallel": {
      "Type": "Parallel",
      "Next": "Goodbye",
      "Branches": [
        {
          "StartAt": "p1",
          "States": {
            "p1": {
              "Type": "Task",
              "Resource": "arn:aws:lambda:eu-west-1:9999999999:function:HelloWord_Step",
              "End": true
            }
          }
        },
        {
          "StartAt": "p2",
          "States": {
            "p2": {
              "Type": "Task",
              "Resource": "arn:aws:lambda:eu-west-1:99999999999:function:HelloWord_Step",
              "End": true
            }
          }
        }
      ]
    },
    "Goodbye": {
      "Type": "Task",
      "Resource": "arn:aws:lambda:eu-west-1:99999999999:function:HelloWord_Step",
      "End": true
    }
  }
}
I click on Preview to see it graphically:

Then I select the IAM role that Step Functions thoughtfully created for me:

And I am all set! Now I can execute my state machine from the console; I can start it off with a block of JSON that is passed to the first function:

The state machine starts to execute as soon as I click on Start Execution. I can follow along and watch as execution flows from state to state:

I can visit the Lambda Console and see that my function ran four times as expected (I was pressed for time and didn’t bother to create four separate functions):

AWS Step Functions records complete information about each step and I can access it from the Step Functions Console:

AWS Step Functions API
As I mentioned earlier, most of your interaction with AWS Step Functions will happen through the APIs. Here’s a quick overview of the principal functions:
- CreateStateMachine – Create a new state machine, given a JSON description.
- ListStateMachines – Get a list of state machines.
- StartExecution – Run (asynchronously) a state machine.
- DescribeExecution – Get information about an execution.
- GetActivityTask – Poll for new tasks to run (used by long-running workers).
You could arrange to run a Lambda function every time a new object is uploaded to an S3 bucket. This function can then kick off a state machine execution by calling StartExecution. The state machine could (as an example) validate the image, generate multiple sizes and formats in parallel, check for particular types of content, and update a database entry.
The same functionality is also available from the AWS Command Line Interface (CLI).
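For example, here's a sketch of kicking off an execution from the CLI; the state machine ARN is a placeholder, and the input is simply whatever JSON your first state expects:
$ aws stepfunctions start-execution \
    --state-machine-arn arn:aws:states:us-east-1:123456789012:stateMachine:MyStateMachine \
    --input '{"bucket": "my-uploads", "key": "photo.jpg"}'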
Development Tools
You can use our new statelint gem to check your hand or machine-generated JSON for common errors including unreachable states and the omission of a terminal state.
Download it from the AWS Labs GitHub repo (it will also be available on RubyGems) and install it like this:
$ sudo gem install j2119-0.1.0.gem statelint-0.1.0.gem
Here’s what happens if you have a problem:
$ statelint my_state.json
2 errors:
State Machine.States.Goodbye does not have required field "Next"
No terminal state found in machine at State Machine.States
And if things look good:
$ statelint my_state.json
$
Available Now
AWS Step Functions is available now and you can start using it today in the US East (Northern Virginia), US East (Ohio), US West (Oregon), EU (Ireland), and Asia Pacific (Tokyo) Regions.
As part of the AWS Free Tier, you can perform up to 4,000 state transitions per month at no charge. After that, you pay $0.025 for every 1,000 state transitions.
You can learn more during our webinar on December 16th. Register here.
— Jeff;
Lambda@Edge – Preview
Just last week, a comment that I made on Hacker News resulted in an interesting email from an AWS customer!
He told me that he runs a single page app that is hosted on S3 (read about this in Host Your Static Website on Amazon S3) and served up at low latency through Amazon CloudFront. The page includes some dynamic elements that are customized for each user via an API hosted on AWS Elastic Beanstalk.
Here’s how he explained his problem to me:
In order to properly get indexed by search engines and in order for previews of our content to show up correctly within Facebook and Twitter, we need to serve a prerendered version of each of our pages. In order to do this, every time a normal user hits our site we need them to be served our normal front end from Cloudfront. But if the user agent matches Google / Facebook / Twitter etc., we need to instead redirect them to the prerendered version of the site.
Without spilling any beans I let him know that we were very aware of this use case and that we had some interesting solutions in the works. Other customers have also let us know that they want to customize their end user experience by making quick decisions out at the edge.
It turns out that there are many compelling use cases for “intelligent” processing of HTTP requests at a location that is close (latency-wise) to the customer. These include inspection and alteration of HTTP headers, access control (requiring certain cookies to be present), device detection, A/B testing, expedited or special handling for crawlers or ‘bots, and rewriting user-friendly URLs to accommodate legacy systems. Many of these use cases require more processing and decision-making than can be expressed by simple pattern matching and rules.
Lambda@Edge
In order to provide support for these use cases (and others that you will dream up), we are launching a preview of Lambda@Edge. This new Lambda-based processing model allows you to write JavaScript code that runs within the ever-growing network of AWS edge locations.
You can now write lightweight request processing logic that springs to life quickly and handles requests and responses that flow through a CloudFront distribution. You can run code in response to four distinct events:
Viewer Request – Your code will run on every request, whether the content is cached or not. Here’s some simple header processing code:
exports.viewer_request_handler = function(event, context) {
  var headers = event.Records[0].cf.request.headers;
  for (var header in headers) {
    headers["X-".concat(header)] = headers[header];
  }
  context.succeed(event.Records[0].cf.request);
}
Origin Request – Your code will run when the requested content is not cached at the edge, before the request is passed along to the origin. You can add more headers, modify existing ones, or modify the URL.
Viewer Response – Your code will run on every response, cached or not. You could use this to clean up some headers that need not be passed back to the viewer.
Origin Response – Your code will run after a cache miss causes an origin fetch and returns a response to the edge.
Your code has access to many aspects of the requests and responses including the URL, method, HTTP version, client IP address, and headers. Initially, you will be able to add, delete, and modify the headers. Soon, you will have complete read/write access to all of the values including the body.
Because your JavaScript code will be part of the request/response path, it must be lean, mean, and self-contained. It cannot make calls to other web services and it cannot access other AWS resources. It must run within 128 MB of memory, and complete within 50 ms.
To get started, you will simply create a new Lambda function, set your distribution as the trigger, and choose the new Edge runtime:
Then you write your code as usual; Lambda will take care of the behind-the-scenes work of getting it to the edge locations.
Interested?
I believe that this cool new processing model will lead to the creation of some very cool new applications and development tools. I can’t wait to see what you come up with!
We are launching a limited preview of Lambda@Edge today and are taking applications now. If you have a relevant use case and are ready to try this out, please apply here.
You can find out more on December 16th by joining our webinar. Register here.
— Jeff;
New – AWS Personal Health Dashboard – Status You Can Relate To
We launched the AWS Service Health Dashboard way back in 2008! Back then, the AWS Cloud was relatively new, and the Service Health Dashboard was a good way for our customers to check on the status of each service (compare the simple screen shot in that blog post to today’s Service Health Dashboard to see how much AWS has grown in just 8 years).
While the current dashboard is good at displaying the overall status of each AWS service, it is, by its nature, impersonal. When you pay it a visit, you are probably more concerned about the status of the AWS services and resources that you are using than you are about the overall status of AWS.
New Personal Health Dashboard
In order to provide you with additional information that is of direct interest to you, we are launching the AWS Personal Health Dashboard today.
As the name indicates, this dashboard gives you a personalized view into the performance and availability of the AWS services that you are using, along with alerts that are automatically triggered by changes in the health of the services. It is designed to be the single source of truth with respect to your cloud resources, and should give you more visibility into any issues that might affect you.
You will see a notification icon in the Console menu when your dashboard contains an item of interest to you. Click on it to see a summary:

Clicking on Open issues displays issues that might affect your AWS infrastructure (this is all test data, by the way):

Clicking on an item will give you more information, including guidance on how to remediate the issue:

The dashboard also gives you a heads-up in advance of scheduled activities:

As well as other things that should be of interest to you:

But Wait, There’s More
You can also use CloudWatch Events to automate your response to alerts and notification of scheduled activities. For example, you could respond to a notification of an impending maintenance event on a critical EC2 instance by proactively moving to a fresh instance.
If your organization subscribes to AWS Business Support or AWS Enterprise Support, you also have access to the new AWS Health API. You can use this API to integrate your existing in-house or third-party IT Management tools with the information in the Personal Health Dashboard.
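Here's a sketch of what a call to the Health API looks like from the CLI, assuming the single US East (Northern Virginia) endpoint; the filter shown is just an illustration:
$ aws health describe-events --region us-east-1 --filter '{"eventStatusCodes": ["open", "upcoming"]}'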
If you would like to learn even more, we have a webinar on January 17th! Register for it here.
— Jeff;
AWS Batch – Run Batch Computing Jobs on AWS
I entered college in the fall of 1978. The Computer Science department at Montgomery College was built around a powerful (for its time) IBM 370/168 mainframe. I quickly learned how to use the keypunch machine to prepare my card decks, prefacing the actual code with some cryptic Job Control Language (JCL) statements that set the job’s name & priority, and then invoked the FORTRAN, COBOL, or PL/I compiler. I would take the deck to the submission window, hand it to the operator in exchange for a job identifier, and then come back several hours later to collect the printed output and the card deck. I studied that printed output with care, and was always shocked to find that after my job spent several hours waiting for its turn to run, the actual run time was just a few seconds. As my fellow students and I quickly learned, jobs launched by the school’s IT department ran at priority 4 while ours ran at 8; their jobs took precedence over ours. The goal of the entire priority mechanism was to keep the expensive hardware fully occupied whenever possible. Student productivity was assuredly secondary to efficient use of resources.
Batch Computing Today
Today, batch computing remains important! Easier access to compute power has made movie studios, scientists, researchers, numerical analysts, and others with an insatiable appetite for compute cycles hungrier than ever. Many organizations have attempted to feed these needs by building in-house compute clusters powered by open source or commercial job schedulers. Once again, priorities come into play and there never seems to be enough compute power to go around. Clusters are expensive to build and to maintain, and they often consist of a large array of identical, undifferentiated processors, all of the same vintage and built to the same specifications.
We believe that cloud computing has the potential to change the batch computing model for the better, with fast access to many different types of EC2 instances, the ability to scale up and down in response to changing needs, and a pricing model that allows you to bid for capacity and to obtain it as economically as possible. In the past, many AWS customers have built their own batch processing systems using EC2 instances, containers, notifications, CloudWatch monitoring, and so forth. This turned out to be a very common AWS use case and we decided to make it even easier to achieve.
Introducing AWS Batch
Today I would like to tell you about a new set of fully-managed batch capabilities. AWS Batch allows batch administrators, developers, and users to have access to the power of the cloud without having to provision, manage, monitor, or maintain clusters. There’s nothing to buy and no software to install. AWS Batch takes care of the undifferentiated heavy lifting and allows you to run your container images and applications on a dynamically scaled set of EC2 instances. It is efficient, easy to use, and designed for the cloud, with the ability to run massively parallel jobs that take advantage of the elasticity and selection provided by Amazon EC2 and EC2 Spot, and to easily and securely interact with other AWS services such as Amazon S3, DynamoDB, and SNS.
Let’s start by taking a look at some important AWS Batch terms and concepts (if you are already doing batch computing, many of these terms will be familiar to you, and still apply). Here goes:
Job – A unit of work (a shell script, a Linux executable, or a container image) that you submit to AWS Batch. It has a name, and runs as a containerized app on EC2 using parameters that you specify in a Job Definition. Jobs can reference other jobs by name or by ID, and can be dependent on the successful completion of other jobs.
Job Definition – Specifies how Jobs are to be run. Includes an AWS Identity and Access Management (IAM) role to provide access to AWS resources, and also specifies both memory and CPU requirements. The definition can also control container properties, environment variables, and mount points. Many of the specifications in a Job Definition can be overridden by specifying new values when submitting individual Jobs.
Job Queue – Where Jobs reside until scheduled onto a Compute Environment. A priority value is associated with each queue.
Scheduler – Attached to a Job Queue, a Scheduler decides when, where, and how to run Jobs that have been submitted to a Job Queue. The AWS Batch Scheduler is FIFO-based, and is aware of dependencies between jobs. It enforces priorities, and runs jobs from higher-priority queues in preference to lower-priority ones when the queues share a common Compute Environment. The Scheduler also ensures that the jobs are run in a Compute Environment of an appropriate size.
Compute Environment – A set of managed or unmanaged compute resources that are used to run jobs. Managed environments allow you to specify desired instance types at several levels of detail. You can set up Compute Environments that use a particular type of instance, a particular model such as c4.2xlarge or m4.10xlarge, or simply specify that you want to use the newest instance types. You can also specify the minimum, desired, and maximum number of vCPUs for the environment, along with a percentage value for bids on the Spot Market and a target set of VPC subnets. Given these parameters and constraints, AWS Batch will efficiently launch, manage, and terminate EC2 instances as needed. You can also launch your own Compute Environments. In this case you are responsible for setting up and scaling the instances in an Amazon ECS cluster that AWS Batch will create for you.
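To give you a sense of how these concepts map to the API, here's a sketch of creating a managed Compute Environment from the CLI. The names, subnets, security groups, and roles are placeholders, and I've kept the compute-resources shorthand to a minimum:
$ aws batch create-compute-environment \
    --compute-environment-name MainCompute \
    --type MANAGED \
    --compute-resources type=EC2,minvCpus=0,desiredvCpus=4,maxvCpus=64,instanceTypes=optimal,subnets=subnet-0123456789abcdef0,securityGroupIds=sg-0123456789abcdef0,instanceRole=ecsInstanceRole \
    --service-role arn:aws:iam::123456789012:role/AWSBatchServiceRole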
A Quick Tour
You can access AWS Batch from the AWS Management Console, AWS Command Line Interface (CLI), or via the AWS Batch APIs. Let’s take a quick console tour!
The Status Dashboard displays my Jobs, Job Queues, and Compute Environments:

I need a place to run my Jobs, so I will start by selecting Compute environments and clicking on Create environment. I begin by choosing to create a Managed environment, giving it a name, and choosing the IAM roles (these were created automatically for me):

Then I set up the provisioning model (On-Demand or Spot), choose the desired instance families (or specific types), and set the size of my Compute Environment (measured in vCPUs):

I wrap up by choosing my VPC, the desired subnets for compute resources, and the security group that will be associated with those resources:

I click on Create and my first Compute Environment (MainCompute) is ready within seconds:

Next, I need a Job Queue to feed work to my Compute Environment. I select Queues and click on Create Queue to set this up. I accept all of the defaults, connect the Job Queue to my new Compute Environment, and click on Create queue:

Again, it is available within seconds:

Now I can set up a Job Definition. I select Job definitions and click on Create, then set up my definition (this is a very simple job; I am sure you can do better). My job runs the sleep command, needs 1 vCPU, and fits into 128 MB of memory:

I can also pass in environment variables, disable privileged access, specify the user name for the process, and arrange to make file systems available within the container:

I click on Save and my Job Definition is ready to go:

Now I am ready to run my first Job! I select Jobs and click on Submit job:

I can also override many aspects of the job, add additional tags, and so forth. I’ll leave everything as-is and click on Submit:

And there it is:

I can also submit jobs by specifying the Ruby, Python, Node, or Bash script that implements the job. For example:

The command line equivalents to the operations that I used in the console include create-compute-environment, describe-compute-environments, create-job-queue, describe-job-queues, register-job-definition, submit-job, list-jobs, and describe-jobs.
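For example, here's a sketch of submitting a job from the CLI (the job name, queue, and definition are placeholders; revisions and container overrides are optional):
$ aws batch submit-job --job-name sleep-test --job-queue MyQueue --job-definition SleepJobDefinition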
I expect to see the AWS Batch APIs used in some interesting ways. For example, imagine a Lambda function that is invoked when a new object (a digital X-Ray, a batch of seismic observations, or a 3D scene description) is uploaded to an S3 bucket. The function can examine the object, extract some metadata, and then use the SubmitJob function to submit one or more Jobs to process the data, with updated data stored in Amazon DynamoDB and notifications sent to Amazon Simple Notification Service (SNS) along the way.
Pricing & Availability
AWS Batch is in Preview today in the US East (Northern Virginia) Region. In addition to regional expansion, we have many other interesting features on the near-term AWS Batch roadmap. For example, you will be able to use an AWS Lambda function as a Job.
There’s no charge for the use of AWS Batch; you pay only for the underlying AWS resources that you consume.
If you’d like to learn more we have a webinar coming December 12th. Register here.
— Jeff;
Amazon Pinpoint – Hit your Targets with AWS
My colleague Georgie Mathews wrote the guest post below to introduce you to Amazon Pinpoint, a new service that helps you to measure and improve user engagement for your mobile apps.
— Jeff;
Our mobile customers have told us how expensive it can get to acquire new users for their apps. Then there is the challenge of retaining those users and encouraging them to use the app frequently. To help keep users coming back, app companies run engagement campaigns using push notifications. These campaigns can vary depending on the app. For example, game developers may send users an in-app notification with new level hints and bonuses if they are stuck on one level for too long, or retailers send users promotional information in the event of a sale or if they haven’t opened the app recently.
Measuring and constantly improving targeted push notification campaigns is essential to increasing user engagement. Sending too many, or poorly timed, notifications can cause users to turn them off or even uninstall the app. Push notification campaigns that are targeted based on app usage trends and user behavior increase message relevance and effectiveness, while helping app developers define and measure messaging benchmarks for campaigns.
Previously, if you wanted to engage users with targeted push notification campaigns, you either used a third-party service, or you had to build your own targeting solutions. Building your own in-house campaign management solution also meant you had to manage scalability, feature support, and maintenance.
Introducing Amazon Pinpoint
Today we are launching Amazon Pinpoint, a new service that makes it easy to run targeted campaigns to improve user engagement. Pinpoint helps you understand your users’ behavior, define who to target, what messages to send, and when to deliver them, and track the results of the campaign.
Pinpoint enables real-time analytics with dashboards for analyzing user engagement, monetization, user demographics, custom events, and funnels so you can understand how users engage with your application. You can analyze and understand your user data by drilling down based on the segments you’ve defined, segmentation attributes, or time.
With Pinpoint, you can define target segments from a variety of different data sources. You can identify target segments from app user data collected in Pinpoint. You can build custom target segments from user data collected in other AWS services such as Amazon S3 and Amazon Redshift, and import target user segments from third party sources such as Salesforce via S3.
Once you define your segments, Pinpoint lets you send targeted notifications with personalized messages to each user in the campaign, based on custom attributes such as game level, favorite team, or news preferences. Amazon Pinpoint can send push notifications immediately, at a time you define, or as a recurring campaign. By scheduling campaigns, you can optimize the push notifications to be delivered at a specific time across multiple time zones. For your marketing campaigns, Pinpoint supports Rich Notifications, which enable you to include images in your messages. We also support silent or data notifications, which allow you to control app behavior and app configuration in the background.
Once your campaign is running, Amazon Pinpoint provides metrics to track the impact of your campaign, including the number of notifications received, number of times the app was opened as a result of the campaign, time of app open, push notification opt-out rate, and revenue generated from campaigns. You can also export the resulting event data and run custom analytics using your existing analytics systems. You can also A/B test different messages, track results, and then send the best message to your target segment.
With Pinpoint there is no minimum fee, no setup cost and no fixed monthly cost based on your total user pool. You only pay for the number of users you target or collect events from, the messages you send, and events you collect, so you can start small and scale as your application grows.
Now let’s take a look at how Pinpoint makes it easy to set up a campaign.
Create a new Mobile Hub project from the AWS Mobile Hub console:

Choose Add User Engagement, enable app and campaign analytics by clicking Enable Engagement, and add your GCM/FCM and APNS credentials.

See integration steps for User Engagement within the Integrate section of Mobile Hub.

Once you have completed the integration steps in Mobile Hub, head over to the Pinpoint console, where you will see your app live.

Click on Campaigns → Create Campaign:

Leave Standard Campaign selected and click on Segment to define your targeting criteria:

Click on Message, type in a message and click Schedule:

Choose Immediate from the drop down, click Review and Launch and then finally Launch Campaign.
You can also view your app analytics with Pinpoint using the Pinpoint Analytics dashboard:

Pricing and Availability
We are launching Amazon Pinpoint today in the US East (Northern Virginia) Region, and plan to expand it to other regions in the near future. Let us know what you think!
Georgie Mathews, Senior Product Manager
Update: You can learn more about Pinpoint with our webinar on January 20th. Sign up for the webinar here.

