AWS Official Blog
Many AWS customers use tags (key/value pairs) to organize their AWS resources. A recent Reddit thread (Share with us your AWS Tagging standards) provides a nice glimpse into some popular tagging strategies and practices.
Late last year we launched Resource Groups and Tag Editor. We gave you the ability to use Resource Groups to create, maintain, and view collections of AWS resources that share common tags. We also gave you the Tag Editor to simplify and streamline the process of finding and tagging AWS resources.
Today we are enhancing the tag search model that you use to create Resource Groups and to edit tags with the addition of substring search. If you encode multiple pieces of information within a single value, this feature can be very helpful. For example, you can locate resources that are tagged according to a pattern of the form “SystemDB-Dev-01-jeff” by searching for “Dev” like this:
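As a rough sketch of what this substring matching does, consider filtering a list of resources in code (the resource IDs and tag values below are invented for illustration):

```python
# Toy model of the new tag substring search: return every resource whose
# value for a given tag key contains the search substring.
resources = [
    {"id": "i-1a2b3c4d", "tags": {"Name": "SystemDB-Dev-01-jeff"}},
    {"id": "i-5e6f7a8b", "tags": {"Name": "SystemDB-Prod-01-ana"}},
    {"id": "vol-9c0d1e2f", "tags": {"Name": "ScratchDB-Dev-02-jeff"}},
]

def find_by_tag_substring(resources, key, substring):
    """Return resources whose tag `key` contains `substring`."""
    return [r for r in resources if substring in r["tags"].get(key, "")]

# Searching for "Dev" matches both Dev-tagged resources above.
dev_resources = find_by_tag_substring(resources, "Name", "Dev")
```

This is why encoding multiple pieces of information in a single tag value works well with the new search: any one of the encoded pieces can serve as the search term.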
The Tag Editor now supports “deep links” — specially constructed URLs that take you directly to a particular set of resources. Here are a couple of examples:
- Find all taggable resources (all resource types in all regions).
- Find all resources tagged with “Name”.
You can perform a similar substring search when you are creating a Resource Group:
Again, you can use deep links to find resources. Here’s an example:
This feature is available now and you can start using it today.
Following up on his recent guest post, my colleague Peter Moon has more news for Go developers!
Since our initial kickoff announcement in January, we have been revamping the internals of the AWS SDK for Go in the project’s ‘develop’ branch on GitHub, laying out a solid foundation for a well-tested, robustly generated SDK that meets the same high quality bar as our other official SDKs.
Today, with complete support for all AWS protocols and services, the develop branch has been merged to the master branch of the project. At this point the SDK’s architecture and interfaces include the initial set of key changes we have envisioned, and we’re excited to announce our progress and humbly invite customers to try out the SDK again.
While collecting and responding to your valuable feedback, we will also continue to work on additional improvements including various usability features and better documentation. We are immensely grateful for the amount of engagement and support we’ve been getting from the community and look forward to continue making AWS a better place for Go developers!
— Peter Moon, Senior Product Manager
Many AWS customers use Amazon EMR to process huge amounts of data. Built around Hadoop, EMR allows these customers to build highly scalable processing systems that can quickly and efficiently digest raw data and turn it into actionable business intelligence.
EMR File System (EMRFS) enables Amazon EMR clusters to operate directly on data in Amazon Simple Storage Service (S3), making it easy for customers to work with input and output files in S3. Until now, EMRFS supported unencrypted and server-side encrypted objects in S3.
Support for Amazon S3 Client-Side Encryption in the EMRFS
Today we’re adding support for client-side encrypted objects in S3, enabling you to use your own keys. The EMRFS S3 client-side encryption uses the same envelope encryption method found in the generic S3 Encryption Client, allowing you to use Amazon EMR to easily process data uploaded to S3 using that client. This feature does not, however, encrypt data stored in HDFS on the local disks of your Amazon EMR cluster or data in transit between your cluster nodes.
The encryption is transparent to the applications running on the EMR cluster.
You can store your keys in the AWS Key Management Service (KMS) or provide custom logic to access keys in on-premises HSMs or other customer key management systems. Amazon EMR can use an Encryption Materials Provider that you supply, so you can store your keys in any location where Amazon EMR can use them.
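To illustrate the envelope encryption pattern used by the S3 Encryption Client, here is a toy sketch in Python. It substitutes a throwaway XOR keystream for a real cipher, so it is for illustration only and must never be used for actual encryption — the point is the pattern: each object is encrypted with its own data key, and the data key is in turn wrapped by a master key that you control:

```python
import hashlib
import os

def _keystream_xor(key: bytes, data: bytes) -> bytes:
    # Toy stream cipher for illustration only -- NOT real encryption.
    stream = b""
    counter = 0
    while len(stream) < len(data):
        stream += hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
        counter += 1
    return bytes(a ^ b for a, b in zip(data, stream))

def envelope_encrypt(master_key: bytes, plaintext: bytes):
    data_key = os.urandom(32)                            # fresh per-object data key
    ciphertext = _keystream_xor(data_key, plaintext)     # encrypt data with data key
    wrapped_key = _keystream_xor(master_key, data_key)   # wrap data key with master key
    # The wrapped key travels with the object; the master key never does.
    return wrapped_key, ciphertext

def envelope_decrypt(master_key: bytes, wrapped_key: bytes, ciphertext: bytes):
    data_key = _keystream_xor(master_key, wrapped_key)   # unwrap the data key
    return _keystream_xor(data_key, ciphertext)          # then decrypt the data
```

In the real EMRFS feature, the “unwrap” step is where your Encryption Materials Provider comes in: it supplies the master key material, whether that comes from AWS KMS, an on-premises HSM, or your own key management system.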
Enabling Encryption From the Console
You can enable this new feature from the EMR Console like this:
Based on the option that you select, the console will prompt you for additional information. For example, if you choose to use the Key Management Service, you can choose the desired one from the menu (you can also enter the ARN of an AWS KMS key if the key is owned by another AWS account):
Custom Key Management With the EMRFS
You can create a custom Encryption Materials Provider class to supply keys to the EMRFS using your own logic. The EMRFS passes information from the S3 object metadata to the provider to indicate which key to retrieve for decryption. Your code is responsible for retrieving the keys; the EMRFS uses whatever key the provider presents. When you choose the custom encryption materials provider option, you simply supply the Amazon S3 location of your provider, and Amazon EMR automatically adds the provider to the cluster and uses it with the EMRFS.
This feature is available now and you can start using it today. You will need to use the latest EMR AMI (version 3.6.0 or later).
We are making some improvements to Amazon CloudFront‘s reporting feature. These improvements will allow you to learn even more about how and where your content is being accessed, export your data for additional analysis, and easily monitor and set alarms on a set of six metrics that CloudFront publishes to CloudWatch.
Let’s take a look at each of these new features!
New Devices Report
This report provides information about the types of devices that make requests to CloudFront during a specified time period:
You can access this new report via the CloudFront Console. Simply select Viewers under Reports and Analytics and then click on Devices.
CSV Data Export
You can now export the data contained in the various Reports & Analytics charts to a CSV file. Simply click on the CSV button:
CloudFront Metrics and Alarms
You can now view CloudFront’s CloudWatch metrics directly (you no longer need to go to the CloudWatch Console). You can also click on the Create Alarm button to create an alarm for any desired metric:
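As an illustration of what such an alarm looks like, the parameters below reflect how CloudFront publishes its metrics to CloudWatch (namespace, metric name, and dimensions); the distribution ID and threshold are made-up examples:

```python
# Hypothetical parameters for an alarm on CloudFront's 4xxErrorRate metric.
# CloudFront metrics live in the AWS/CloudFront namespace with DistributionId
# and Region (always "Global") dimensions. The ID and threshold are examples.
alarm_params = {
    "AlarmName": "high-4xx-error-rate",                  # example name
    "Namespace": "AWS/CloudFront",
    "MetricName": "4xxErrorRate",
    "Dimensions": [
        {"Name": "DistributionId", "Value": "EDFDVBD6EXAMPLE"},  # placeholder
        {"Name": "Region", "Value": "Global"},
    ],
    "Statistic": "Average",
    "Period": 300,                                       # five-minute periods
    "EvaluationPeriods": 2,
    "Threshold": 5.0,                                    # percent, example value
    "ComparisonOperator": "GreaterThanThreshold",
}
```

With credentials configured, a dict like this could be passed to `boto3.client("cloudwatch").put_metric_alarm(**alarm_params)`; the Create Alarm button in the console builds the equivalent for you.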
Better Popular Object URL Data
On March 13th we increased the number of characters in the Popular Object URLs report from 50 characters to 500 characters. If you view or download this report for date ranges that start before March 13th, you’ll see up to 50 characters. For date ranges that start on or after March 13th, you’ll now see up to 500 characters:
The CloudFront Console now saves your settings (selected parameters and UI preferences) to your local browser so that they remain available when you switch tabs or log out.
There’s now a What’s New link on the Console so that you can easily learn about newly launched features, upcoming webinars, and other announcements.
You can learn about more Amazon CloudFront Reports by visiting the Amazon CloudFront Reports & Analytics page.
CloudFront Office Hours
The CloudFront team will be holding office hours on March 26th at 10:00 AM PDT. Visit the CloudFront Webinars page to sign up and to learn more about other online events that may be of interest to you.
Amazon S3 turned nine years old last week!
Since that time we have added dozens of features, expanded across the globe, and reduced the prices for storage and bandwidth multiple times. You, our customers, have trusted us with your mission-critical data and have used S3 in thousands of interesting and unique ways. Your creativity and your feedback (keep it coming) have given us the insights that we need to have in order to ensure that S3 continues to meet your requirements for object storage.
While the name space for buckets is global, S3 (like most of the other AWS services) runs in each AWS region (see the AWS Global Infrastructure page for more information). This model gives you full control over the location of your data; you can choose an appropriate location based on local regulatory requirements, a desire to have the data close to your principal customers to reduce latency, or for other reasons.
Many of you have told us that you need to keep copies of your critical data in locations that are hundreds of miles apart. This is often a consequence of having to comply with stringent regulatory requirements for the storage of sensitive financial and personal data.
In order to make it easier for you to make copies of your S3 objects in a second AWS region, we are launching Cross-Region Replication today. You can use this feature to meet all of the needs that I described above including geographically diverse replication and adjacency to important customers.
Once enabled, every object uploaded to a particular S3 bucket is automatically replicated to a designated destination bucket located in a different AWS region.
You can enable and start using this feature in a couple of minutes! It is built on top of S3’s existing versioning facility; the console will help you to turn versioning on if necessary:
With versioning enabled, the rest is easy. You simply choose the destination region and bucket (and optionally restrict replication to a subset of the objects in the bucket using a prefix), set up an IAM role, and you are done.
You can choose an existing bucket or you can create a new one as part of this step:
You will also need to set up an IAM role so that S3 can list and retrieve objects from the source bucket and initiate replication operations on the destination bucket. Because you have the opportunity to control the policy document, you can easily implement advanced scenarios such as replication between buckets owned by separate AWS accounts. The console will help you to set up the proper IAM role by supplying a default policy:
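For reference, a replication role policy along these lines grants S3 the access just described (the bucket names are placeholders; treat the console-supplied default policy as the authoritative version):

```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "s3:GetReplicationConfiguration",
        "s3:ListBucket"
      ],
      "Resource": "arn:aws:s3:::source-bucket"
    },
    {
      "Effect": "Allow",
      "Action": [
        "s3:GetObjectVersion",
        "s3:GetObjectVersionAcl"
      ],
      "Resource": "arn:aws:s3:::source-bucket/*"
    },
    {
      "Effect": "Allow",
      "Action": [
        "s3:ReplicateObject",
        "s3:ReplicateDelete"
      ],
      "Resource": "arn:aws:s3:::destination-bucket/*"
    }
  ]
}
```

The role’s trust policy must also allow the S3 service to assume it; for the cross-account scenario, the destination statements would reference the other account’s bucket.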
Once I had the replication all set up, I inspected the destination bucket. As expected, it was empty (replication works on newly created objects):
I uploaded a picture, and selected Reduced Redundancy Storage (RRS) and Server Side Encryption (SSE) using the Amazon S3 master key:
I refreshed my view of the destination bucket a couple of times (I’m impatient) and the object was there, as expected. I verified that the replica also used RRS and SSE:
The replication process also copies any metadata and ACLs (Access Control Lists) associated with the object.
You can also enable and manage this feature through the S3 API.
A Few Details
Here are a few things to keep in mind as you start to think about how to make use of Cross-Region Replication in your own operating environment.
Versioning – As I mentioned earlier, you must first enable S3 versioning for the source and destination buckets.
Determining Replication Status – You (or your code) can use the HEAD operation on a source object to determine its replication status. You can also (as you saw above) view this status in the Console.
Region-to-Region – Replication always takes place between a pair of AWS regions. You cannot use this feature to replicate content to two buckets that are in the same region.
New Objects – Because this feature watches the source bucket for changes, it replicates new objects and changes to existing objects. If you need to replicate existing objects, a solution built around the S3 COPY operation can be used to bring the destination bucket up to date.
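As a small sketch of the HEAD-based status check mentioned above, the helper below interprets the x-amz-replication-status header from a HEAD response (the status values are the ones S3 reports; the helper itself, and the header-dict shape, are illustrative — SDKs may surface the header differently):

```python
def replication_state(headers: dict) -> str:
    """Interpret the x-amz-replication-status header from an S3 HEAD response."""
    status = headers.get("x-amz-replication-status")
    if status is None:
        # Objects outside the replication configuration carry no status header.
        return "not subject to replication"
    return {
        "PENDING": "replication in progress",
        "COMPLETED": "replicated to the destination bucket",
        "FAILED": "replication failed",
        "REPLICA": "this object is a replica",
    }.get(status, "unknown status: " + status)
```

A source object moves from PENDING to COMPLETED (or FAILED), while the copy in the destination bucket reports REPLICA.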
To learn more, read about Cross-Region Replication in the S3 Developer Guide.
This feature is available now and you can start using it today. In addition to the storage charges for the data in the destination bucket, you will also pay the usual AWS price for data transfer between regions. For more information, please consult the S3 Pricing page.
We release new versions of the Amazon Linux AMI every six months after a public testing phase that includes one or more Release Candidates. The Release Candidates are announced in the EC2 forum.
Launching 2015.03 Today
Today we are releasing the 2015.03 Amazon Linux AMI for use in PV and HVM mode, with support for EBS-backed and Instance Store-backed AMIs.
This AMI uses kernel 3.14.35 and is available in all AWS regions.
You can launch this new version of the AMI in the usual ways. You can also upgrade an existing EC2 instance by running
sudo yum clean all
sudo yum update
and then rebooting it.
The roadmap for the Amazon Linux AMI is driven in large part by customer requests. During this release cycle, we have added a number of features as a result of these requests; here’s a sampling:
- Python 2.7 is now the default for core system packages, including yum and cloud-init; versions 2.6 and 3.4 are also available in the repositories as python26 and python34, respectively.
- The nvidia package (required when you run the appropriate AMI on a G2 instance) is now DKMS-enabled. Updating to a new kernel will trigger an nvidia module rebuild for both the running kernel and the newly installed kernel.
- Ruby 2.2 is now available in the repositories as ruby22; Ruby 2.0 is still the default.
- PHP 5.6 is now available in the repositories as php56; it can run side-by-side with PHP 5.5.
- Docker 1.5 is now included in the repositories.
- Puppet 3.7 is now included. The Puppet 2 and Puppet 3 packages conflict with each other and cannot be installed at the same time.
The release notes contain a longer discussion of the new features and updated packages.
Things to Know
As we have discussed in the past, we are no longer producing new 32-bit AMIs. We are still producing 32-bit packages for customers that are still using the 2014.09 and earlier AMIs. We recommend the use of 64-bit AMIs for new projects.
We are no longer producing new “GPU” AMIs for the CG1 instance type. Once again, package updates are available and the G2 instance type should be used for new projects.
AWS Marketplace makes it easy for you to find, buy, and quickly start using a wide variety of software products from developers located all over the world:
Open in Germany
Today we are making AWS Marketplace available to users of our new Europe (Frankfurt) region. If you are using this region you can make use of over 700 products right now, with more on the way. The AWS Frankfurt region is fully compliant with all applicable EU Data Protection laws, so you can use AWS Marketplace software without data compliance concerns.
You can run popular security products, business intelligence solutions, storage software, and data products. You can also run hundreds of open source titles. Products are priced on an hourly basis so that you can get started with no upfront commitment. There’s also an annually priced option that can be even more economical if you have steady-state workloads. We’ll be adding monthly priced listings on May 1.
To find products of interest to you, simply visit AWS Marketplace, enter a search term, and then select the EU (Frankfurt) region on the left:
Inspect the results, find the desired product, and click on it to initiate the purchasing process:
If you are a software vendor or developer and would like to list your products in AWS Marketplace, please take a look at the Sell on Marketplace information. Customers will be able to launch your products in minutes and pay for them as part of the regular AWS billing system. As a vendor of products that are available in AWS Marketplace, you will be able to discover new customers and benefit from a shorter sales cycle. You also have the opportunity to offer free trials of your product with no additional engineering effort.
We announced the AWS Managed Service Partner program at last year’s AWS re:Invent. We created the program in order to help our customers to find AWS Partner Network (APN) Partners who can deliver managed services in the cloud. In order for an APN Partner to become an approved member of this program, the quality of their offering must meet a high bar and they must pass an independent audit of their AWS Managed Service capability.
Today we welcome the first six APN Consulting Partners into the AWS MSP Program:
To learn more about these partners and the auditing process, please read the new post, Announcing Our Inaugural AWS Managed Service Program Partners on the AWS Partner Network Blog.
Let’s take a quick look at what happened in AWS-land last week:
- Tuesday, March 24 – Webinar – Microsoft Exchange Server 2013 on AWS.
- Wednesday, March 25 – Webinar – Amazon RDS for SQL Server.
- Thursday, March 26 – Webinar – Migrating Windows Server 2003 applications to Windows Server 2008/2012 on AWS with APN Partner AppZero.
- Thursday, March 26 – Webinar – Saving at Scale: How Adobe Manages a Massive Reserved Instance Portfolio – with APN Partner Cloudability and customer Adobe.
- Monday, March 30 – Live Event (San Francisco) – IoT Hack Day: AWS Pop-up Loft Hack Series – Sponsored by Spark.
- Monday, March 30 – Webinar – Getting to 1.5M Ads/Sec: How DataXu Manages Big Data – with APN Partner Qubole and customer DataXu.
- Tuesday, March 31 – Webinar – Cloud E-Discovery: Solving the Right Problems with a Modern Approach – with APN Partner Zapproved and customer Raytheon.
- Thursday, April 2 – Webinar – Strategies for Securing Hybrid Workloads with Level 3 and Cisco.
- Wednesday, April 8 (10 AM PT) – Webinar – Refining Raw Data for Complete Customer Insight with Amazon Redshift and Pentaho – with APN Partner Pentaho and customer Lucky Group.
- AWS Summits.
- April 12-16 – Live Event (Chicago, Illinois) – AWS at HIMSS15.
- April 13-16 – Live Event (Las Vegas, Nevada) – AWS at NAB 2015.
- Friday, April 17 – Webinar – The Power of Automated Cloud DR – With APN Partner CloudVelox.
- April 21-23 – Live Event (Boston, Massachusetts) – AWS at Bio-IT World 2015.
My colleague Jed Sundwall wrote the guest post below to show you how one of the newest AWS Public Data Sets is being put to use.
You can now access over 85,000 Landsat 8 scenes through our newest Public Data Set: Landsat on AWS. The scenes are all available in the landsat-pds bucket in the Amazon S3 US West (Oregon) region.
Landsat is an earth observation program conducted in partnership by the U.S. Geological Survey (USGS) and NASA that creates moderate-resolution satellite imagery of all land on Earth every 16 days. The Landsat program has been running since 1972 and is the longest ongoing project to collect such imagery. Landsat 8 is the newest Landsat satellite and it gathers data based on visible, infrared, near-infrared, and thermal-infrared light.
Because of Landsat’s global purview and long history, it has become a reference point for all Earth observation work and is considered the gold standard of natural resource satellite imagery. It is the basis for research and applications in many global sectors, including agriculture, cartography, geology, forestry, regional planning, surveillance and education. Many of our customers’ work couldn’t be done without Landsat.
As we said in December, we hope to accelerate innovation in climate research, humanitarian relief, and disaster preparedness efforts around the world by making Landsat data readily available near our flexible computing resources. We have committed to host up to a petabyte of Landsat data as a contribution to the White House’s Climate Data Initiative. Because the imagery is available on AWS, researchers and software developers can use any of our on-demand services to perform analysis and create new products without needing to worry about storage or bandwidth costs.
You can learn more about how to access the data on our Landsat on AWS page.
What’s possible with Landsat on AWS
We’ve been testing our approach to hosting Landsat imagery over the past few months and have been amazed by what people have been able to do with it.
Development Seed has updated the popular open source landsat-util library to use data from Landsat on AWS. Now developers who rely on landsat-util can access Landsat data more quickly and with more processing options. Learn more about the updates to landsat-util. Here’s a screen shot of their Libra image browser:
Esri has created a demonstration of how ArcGIS Online can be used to visualize and analyze Landsat data within the browser. Visit Esri’s site to see how powerful and beautiful Landsat data can be.
Mapbox is using Landsat on AWS to power Landsat-live, a map that is constantly refreshed with the latest imagery from NASA’s Landsat 8 satellite, offering the freshest Landsat imagery possible on a global level. Mapbox street data is overlaid on top to provide as much context as possible. Learn more about Landsat-live:
MathWorks has created a freely-downloadable tool for accessing, processing, and visualizing Landsat data in MATLAB. With this tool, you can create a map display of scene locations with markers that show each scene’s metadata. Learn more about the tool and watch a demo video of it on the MathWorks blog.
Planet Labs uses Landsat data for image rectification and as a reference point for its own Earth observing satellites. Learn how Planet Labs uses Landsat on AWS to quickly create better products for its customers.
Left: a Landsat image of the Lower Se San 2 Dam in Cambodia taken on December 22, 2014. Right: A Planet Labs image of the dam taken less than a month later on January 14, 2015.
Accessing the Landsat Data
Rather than hosting each Landsat scene as a .tar archive that contains the scene’s 12 bands and metadata, we make each band of each scene available as a stand-alone GeoTIFF, and host the scene’s metadata as a text file as well as a JSON file.
The data are organized using a directory structure based on each scene’s path and row. For instance, the files for Landsat scene LC80030172015001LGN00 are available in the following location:
The “L8” directory refers to Landsat 8, “003” refers to the scene’s path, “017” refers to the scene’s row, and the final directory matches the scene’s identifier. This identifier takes the form LXSPPPRRRYYYYDDDGSIVV and is segmented as follows:
- L = Landsat
- X = Sensor
- S = Satellite
- PPP = WRS path
- RRR = WRS row
- YYYY = Year
- DDD = Julian day of year
- GSI = Ground station identifier
- VV = Archive version number
In this case, the scene corresponds to WRS path 003, WRS row 017, and was taken on the 1st day of 2015.
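The segmentation above can be captured in a small helper that parses a scene identifier and builds the corresponding S3 prefix (a sketch based on the naming convention described here; the official scene list remains the authoritative index):

```python
def parse_scene_id(scene_id: str) -> dict:
    """Split a Landsat scene ID of the form LXSPPPRRRYYYYDDDGSIVV."""
    if len(scene_id) != 21 or scene_id[0] != "L":
        raise ValueError("not a valid Landsat scene identifier: " + scene_id)
    return {
        "sensor": scene_id[1],              # X
        "satellite": scene_id[2],           # S
        "path": scene_id[3:6],              # PPP (WRS path)
        "row": scene_id[6:9],               # RRR (WRS row)
        "year": int(scene_id[9:13]),        # YYYY
        "day_of_year": int(scene_id[13:16]),# DDD (Julian day)
        "ground_station": scene_id[16:19],  # GSI
        "version": scene_id[19:21],         # VV
    }

def scene_prefix(scene_id: str) -> str:
    """Build the landsat-pds key prefix for a Landsat 8 scene."""
    parts = parse_scene_id(scene_id)
    return "L8/{0}/{1}/{2}/".format(parts["path"], parts["row"], scene_id)
```

For example, `scene_prefix("LC80030172015001LGN00")` yields the `L8/003/017/LC80030172015001LGN00/` location described above.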
Each scene’s directory includes:
- A GeoTIFF (.TIF) for each of the scene’s bands (the GeoTIFFs include 512×512 internal tiling and there can be up to 12 bands),
- An overview file (.TIF.ovr) for each .TIF (useful in GDAL-based applications),
- A small RGB preview JPEG (3% of the original size),
- A large RGB preview JPEG (15% of the original size), and
- An index.html file that can be viewed in a browser to see the RGB preview and links to the GeoTIFFs and metadata files.
For instance, the files associated with scene LC80030172015001LGN00 are available at:
A gzipped CSV describing all available scenes is available at:
If you use the AWS Command Line Interface (CLI), you can access the bucket with this simple shell command:
$ aws s3 ls landsat-pds
We’d like to thank our customers at Development Seed, Esri, Mapbox, MathWorks, and Planet Labs who helped us launch and test this public data set. New collaborators are welcome to contribute to the scripts we use to acquire and process Landsat data on GitHub:
— Jed Sundwall, Open Data Technical Business Manager