

Amazon Polly – Announcing Speech Marks and Whispering

by Tara Walker | in Amazon Polly, Launch

Like me, you may have loved going to the library or bookstore to have your favorite book narrated to you. As a child, I loved listening to books narrated by good storytellers who gave life to their stories by changing the inflection of their voice as needed. The narration, coupled with the visual aids the storytellers used to tell the story, drove my love of reading and exploring new books.

In fact, to ensure that my love of reading extended to classic novels, my parents bought my sister and me a small projector device with a tape recorder. This device would narrate the story and synchronize the projection of the visuals from the book by using a chime sound to signal when we should advance to the next screen. While I have unfortunately dated myself with that story, it is great to look back and consider how far we have come with speech technologies like Text-to-Speech (TTS). Even with all of these advancements, it is still challenging for developers to use TTS to add synchronized speech to the animations of characters or graphics in their games, videos, and digital books. Additionally, it is rare for a TTS solution to successfully emulate the pitch, tempo, and loudness of lifelike speech.
With this in mind, I am happy to announce Amazon Polly is launching support for Speech Marks and Whispering.

Amazon Polly is a deep learning service that enables you to turn text into lifelike speech. You can choose from the 47 lifelike voices included in the service, with support for 24 languages. Using Polly, you send the text you want to convert into speech to the Polly API, and it returns an audio stream that you can play immediately or store in a common audio file format like MP3.
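If you prefer the command line, here is a minimal sketch of that request using the AWS CLI; the voice and output file name are simply illustrative choices for this example:

# Convert a line of text to speech and save the returned audio stream as an MP3 file.
# The voice ID (Joanna) and output file name are illustrative.
aws polly synthesize-speech \
  --text "Hi! My name is Tara. I am excited to talk about Polly's new features." \
  --voice-id Joanna \
  --output-format mp3 \
  hello.mp3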

Speech marks are metadata that allow developers to synchronize speech with visual experiences. This feature enables scenarios such as lip-syncing, where speech is synchronized with facial animations, or highlighting written words as they are spoken. The speech marks metadata describes the synthesized speech, and by using it alongside the speech audio stream, you can determine the beginning and end of sounds, words, sentences, and SSML tags. With the new speech marks, developers can now create lip-syncing avatars, build visually highlighted read-along experiences, and integrate speech capabilities into game engines like Amazon Lumberyard to give a voice to their characters.

There are four types of speech marks:

  • Sentence: Specifies a sentence element in the input text
  • Word: Indicates a word element in the input text
  • Viseme: Illustrates the position of the face and mouth corresponding to the sound that is spoken
  • Speech Synthesis Markup Language (SSML): Describes a <mark> element from the SSML input text.

Whispering is a speech effect similar to pitch, tempo, and loudness, in that it gives developers one more expressive voice feature with which to modify the Text-to-Speech output. The whispering feature allows developers to have words from their input text spoken in a whispered voice using the <amazon:effect name="whispered"> SSML element.

Let’s take a quick look at both these new features.

 

Using Speech Marks

I’ll jump into an example of using speech marks with Amazon Polly in the AWS Console. I’ll go first to the Amazon Polly console and press the Get started button.

I’m taken to the Text-To-Speech menu option, and I select the SSML tab under the Text-to-Speech section. I will simply add two sentences that I wish to be spoken in the provided text field and then select a Voice.

I’ll verify the sentences are in the form that I wish them to be spoken by clicking the Listen to Speech button. Since I like what I hear, I will proceed with adding the speech marks metadata. In order to use speech marks, I will select the Change file format link.

When the Change file format dialog box comes up, I will select the Speech Marks option under File Format, and under the Speech Mark Types section I will choose Word and Sentence by checking the checkboxes beside each speech mark type. Now I will click the Change button.


This returns me to the Text-To-Speech section of the console, and I can now click the Download Speech Marks button to see the generated speech marks.

The downloaded file has a .marks extension and contains JSON describing the start and end of each of my sentences and words. The JSON fields are:

  • Time: timestamp in milliseconds from the beginning of the audio stream
  • Type: type of speech mark (sentence, word, viseme, or ssml)
  • Start: offset in bytes from the start of the object in the input text (not including viseme marks)
  • End: offset in bytes of the object’s end in the input text (not including viseme marks)
  • Value: data that varies based on the type of speech mark; for example, a sentence speech mark contains the entire sentence from the text
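For reference, the same speech marks can be requested from the AWS CLI. The sketch below is illustrative (voice and file names are placeholders), and the sample output lines in the comments show the general shape of the metadata, with one JSON object per line:

# Request sentence and word speech marks instead of audio (voice and file name are illustrative).
aws polly synthesize-speech \
  --text "Hi! My name is Tara. I am excited to talk about Polly's new features." \
  --voice-id Joanna \
  --output-format json \
  --speech-mark-types sentence word \
  speech.marks

# Each line of speech.marks is a JSON object; the values below are illustrative:
# {"time":6,"type":"sentence","start":0,"end":3,"value":"Hi!"}
# {"time":6,"type":"word","start":0,"end":3,"value":"Hi!"}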

 

Using Whispering

As I noted previously, the whispering feature allows me to have my input text spoken in a whispered voice using the SSML amazon:effect element with a name attribute value of whispered. I'll use my example above and insert SSML elements to have some of my text spoken in a whispered voice.

I'll return to the Amazon Polly console and, in the text box, change my current text to use the new whispered voice feature for the sentence, "My name is Tara". To accomplish this I will use the following SSML element: <amazon:effect name="whispered">. The final sentence with SSML marks entered into the text box looks as follows:

<speak>Hi!<amazon:effect name="whispered">My name is Tara.</amazon:effect>I am excited to talk about Polly's new features.</speak>

When I click the Listen to speech button, I will hear that the sentence, “My name is Tara” is indeed spoken in a whispered voice.
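The same whispered output can be produced outside the console. Here is a sketch using the AWS CLI with the SSML input above; the voice and output file name are again illustrative:

# Synthesize SSML input so that one sentence is spoken in a whispered voice.
# The voice ID and output file name are illustrative.
aws polly synthesize-speech \
  --text-type ssml \
  --text "<speak>Hi! <amazon:effect name=\"whispered\">My name is Tara.</amazon:effect> I am excited to talk about Polly's new features.</speak>" \
  --voice-id Joanna \
  --output-format mp3 \
  whisper.mp3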

I want to download my speech output, so I will click the Change file format link. When the Change file format dialog box comes up, I will select the MP3 option under the File format section and then click the Change button.


Now I have the option to download my file by clicking the Download MP3 button.

You can hear my speech output using the new whispered voice by clicking here.

Summary

The Speech Marks and Whispering features are available in Amazon Polly starting today. To learn more about these and other features visit the Amazon Polly developer guide found here: http://docs.aws.amazon.com/polly/latest/dg

For more information about Amazon Polly, visit the Amazon Polly product page or get started by converting your text to speech in the Amazon Polly console.

You should give your text the gift of voice with Amazon Polly today.

Tara

New – Introducing AWS CodeStar – Quickly Develop, Build, and Deploy Applications on AWS

by Tara Walker | in AWS CodeStar, Developer Tools, Launch

It wasn't too long ago that I was on a development team working toward completing a software project by a release deadline and facing the challenges most software teams face today in developing applications: new project environment setup, team member collaboration, and the day-to-day task of keeping track of the moving pieces of code, configuration, and libraries for each development build. Today, with companies' need to innovate and get to market faster, it has become essential to make it easier and more efficient for development teams to create, build, and deploy software.

Unfortunately, many organizations face some key challenges in their quest for a more agile, dynamic software development process. The first challenge most new software projects face is the lengthy setup process that developers have to complete before they can start coding. This process may include setting up IDEs, getting access to the appropriate code repositories, and/or identifying the infrastructure needed for builds, tests, and production.

Collaboration is another challenge that most development teams may face. In order to provide a secure environment for all members of the project, teams have to frequently set up separate projects and tools for various team roles and needs. In addition, providing information to all stakeholders about updates on assignments, the progression of development, and reporting software issues can be time-consuming.

Finally, most companies desire to increase the speed of their software development and reduce the time to market by adopting best practices around continuous integration and continuous delivery. Implementing these agile development strategies may require companies to spend time in educating teams on methodologies and setting up resources for these new processes.

Now Presenting: AWS CodeStar

To help development teams ease the challenges of building software while helping to increase the pace of releasing applications and solutions, I am excited to introduce AWS CodeStar.

AWS CodeStar is a cloud service designed to make it easier to develop, build, and deploy applications on AWS by simplifying the setup of your entire development project. AWS CodeStar includes project templates for common development platforms to enable provisioning of projects and resources for coding, building, testing, deploying, and running your software project.

The key benefits of the AWS CodeStar service are:

  • Easily create new projects using templates for Amazon EC2, AWS Elastic Beanstalk, or AWS Lambda in five different programming languages: JavaScript, Java, Python, Ruby, and PHP. When you select a template, the service provisions the underlying AWS services needed for your project and application.
  • Unified experience for access and security policies management for your entire software team. Projects are automatically configured with appropriate IAM access policies to ensure a secure application environment.
  • Pre-configured project management dashboard for tracking various activities, such as code commits, build results, deployment activity and more.
  • Working sample code to help you get up and running quickly, which you can edit with your favorite IDE, such as Visual Studio, Eclipse, or any code editor that supports Git.
  • Automated configuration of a continuous delivery pipeline for each project using AWS CodeCommit, AWS CodeBuild, AWS CodePipeline, and AWS CodeDeploy.
  • Integration with Atlassian JIRA Software for issue management and tracking directly from the AWS CodeStar console

With AWS CodeStar, development teams can build an agile software development workflow that not only increases the speed with which teams can deploy software and bug fixes, but also enables developers to build software that is more in line with customers' requests and needs.

An example of a responsive development workflow using AWS CodeStar is shown below:

Journey Into AWS CodeStar

Now that you know a little more about the AWS CodeStar service, let's jump into using the service to set up a web application project. First, I'll go into the AWS CodeStar console and click the Start a project button.

If you have not set up the appropriate IAM permissions, AWS CodeStar will show a dialog box requesting permission to administer AWS resources on your behalf. I will click the Yes, grant permissions button to grant AWS CodeStar the appropriate permissions to other AWS resources.

However, I received a warning that I do not have administrative permissions to AWS CodeStar as I have not applied the correct policies to my IAM user. If you want to create projects in AWS CodeStar, you must apply the AWSCodeStarFullAccess managed policy to your IAM user or have an IAM administrative user with full permissions for all AWS services.
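If you prefer to do this from the command line, attaching the managed policy to an IAM user might look like the following sketch (the user name is a placeholder):

# Attach the AWSCodeStarFullAccess managed policy to an existing IAM user (user name is a placeholder).
aws iam attach-user-policy \
  --user-name Tara \
  --policy-arn arn:aws:iam::aws:policy/AWSCodeStarFullAccess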

Now that I have added the aforementioned permissions in IAM, I can use the service to create a project. To start, I simply click the Create a new project button and I am taken to the hub of the AWS CodeStar service.

At this point, I am presented with over twenty different AWS CodeStar project templates to choose from in order to provision various environments for my software development needs. Each project template specifies the AWS Service used to deploy the project, the supported programming language, and a description of the type of development solution implemented. AWS CodeStar currently supports the following AWS Services: Amazon EC2, AWS Lambda, and AWS Elastic Beanstalk. Using preconfigured AWS CloudFormation templates, these project templates can create software development projects like microservices, Alexa skills, web applications, and more with a simple click of a button.

For my first AWS CodeStar project, I am going to build a serverless web application with Node.js and AWS Lambda, using the Node.js/AWS Lambda project template.

You will notice that for this template AWS CodeStar sets up all of the tools and services you need for a development project, including an AWS CodePipeline connected with AWS CodeBuild, AWS CloudFormation, and Amazon CloudWatch. I'll name my new AWS CodeStar project TaraWebProject and click Create Project.

Since this is my first time creating an AWS CodeStar project, I will see a dialog that asks about the setup of my AWS CodeStar user settings. I'll type Tara in the textbox for the Display Name and add my email address in the Email textbox. This information is how I'll appear to others in the project.

The next step is to select how I want to edit my project code. I have decided to edit my TaraWebProject project code using the Visual Studio IDE. With Visual Studio, it will be essential for me to configure it to use the AWS Toolkit for Visual Studio 2015 to access AWS resources while editing my project code. On this screen, I am also presented with the link to the AWS CodeCommit Git repository that AWS CodeStar configured for my project.

The provisioning and tool setup for my software development project is now complete. I’m presented with the AWS CodeStar dashboard for my software project, TaraWebProject, which allows me to manage the resources for the project. This includes the management of resources, such as code commits, team membership and wiki, continuous delivery pipeline, Jira issue tracking, project status and other applicable project resources.

What is really cool about AWS CodeStar for me is that it provides a working sample project from which I can start the development of my serverless web application. To view the sample of my new web application, I will go to the Application endpoints section of the dashboard and click the link provided.

A new browser window will open and will display the sample web application AWS CodeStar generated to help jumpstart my development. A cool feature of the sample application is that the background of the sample app changes colors based on the time of day.

Let's now take a look at the code used to build the sample website. In order to view the code, I will go back to my TaraWebProject dashboard in the AWS CodeStar console and select the Code option from the sidebar menu.

This takes me to the tarawebproject Git repository in the AWS CodeCommit console. From here, I can manually view the code for my web application, review the commits made in the repo, compare commits or branches, and create triggers in response to my repo events.

This gives me a great starting point for developing my AWS-hosted web application. Since I opted to integrate AWS CodeStar with Visual Studio, I can update my web application by using the IDE to make code changes, and those changes will be automatically included in the TaraWebProject every time I commit to the provisioned code repository.

You will notice that on the AWS CodeStar TaraWebProject dashboard, there is a message about connecting the tools to my project repository in order to work on the code. Even though I have already selected Visual Studio as my IDE of choice, let’s click on the Connect Tools button to review the steps to connecting to this IDE.

Again, I will see a screen that allows me to choose which IDE I wish to use to edit my project code: Visual Studio, Eclipse, or command line tools. It is important to note that I have the option to change my IDE choice at any time while working on my development project. Additionally, I can connect to my AWS CodeCommit Git repo via HTTPS or SSH. To retrieve the appropriate repository URL for each protocol, I only need to select the Code repository URL dropdown, select HTTPS or SSH, and copy the resulting URL from the text field.
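For example, cloning the project repository over HTTPS from the command line would look roughly like the sketch below. The region in the URL is an assumption; use the exact URL copied from the Code repository URL dropdown:

# Clone the AWS CodeCommit repository that AWS CodeStar provisioned for the project.
# Requires Git credentials or the AWS CLI credential helper configured for CodeCommit;
# the region in the URL is an assumption.
git clone https://git-codecommit.us-east-1.amazonaws.com/v1/repos/tarawebproject
cd tarawebproject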

After selecting Visual Studio, CodeStar takes me through the steps needed to integrate with Visual Studio. This includes downloading the AWS Toolkit for Visual Studio, connecting Team Explorer to AWS CodeStar via AWS CodeCommit, and pushing changes to the repo.

After successfully connecting Visual Studio to my AWS CodeStar project, I return to the AWS CodeStar TaraWebProject dashboard to start managing the team members working on the web application with me. First, I will select the Setup your team tile so that I can go to the Project Team page.

On my TaraWebProject Project Team page, I'll add a team member, Jeff, by selecting the Add team member button and clicking on the Select user dropdown. Team members must be IAM users in my account, so I'll click on the Create new IAM user link to create an IAM account for Jeff.

When the Create IAM user dialog box comes up, I will enter an IAM user name, Display name, and Email Address for the team member, in this case, Jeff Barr. There are three project roles that Jeff can be granted: Owner, Contributor, or Viewer. For the TaraWebProject application, I will grant him the Contributor project role and allow him to have remote access by selecting the Remote access checkbox. Now I will create Jeff's IAM user account by clicking the Create button.

This brings me to the IAM console to confirm the creation of the new IAM user. After reviewing the IAM user information and the permissions granted, I will click the Create user button to complete the creation of Jeff’s IAM user account for TaraWebProject.

After successfully creating Jeff’s account, it is important that I either send Jeff’s login credentials to him in email or download the credentials .csv file, as I will not be able to retrieve these credentials again. I would need to generate new credentials for Jeff if I leave this page without obtaining his current login credentials. Clicking the Close button returns me to the AWS CodeStar console.

Now I can complete adding Jeff as a team member in the TaraWebProject by selecting the JeffBarr-WebDev IAM role and clicking the Add button.

I've successfully added Jeff as a team member to my AWS CodeStar project, TaraWebProject, enabling team collaboration in building the web application.
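Team membership can also be managed programmatically. The following AWS CLI sketch is an approximation; the project ID, account ID, and IAM user name are placeholders:

# Add an IAM user to an AWS CodeStar project with the Contributor role and remote access enabled.
# The project ID, account ID, and user name are placeholders.
aws codestar associate-team-member \
  --project-id tarawebproject \
  --user-arn arn:aws:iam::123456789012:user/JeffBarr-WebDev \
  --project-role Contributor \
  --remote-access-allowed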

Another thing that I really enjoy about using the AWS CodeStar service is that I can monitor all of my project activity right from my TaraWebProject dashboard. I can see the application activity, any recent code commits, and the status of project actions, such as the results of my build, any code changes, and the deployments, all in one comprehensive dashboard. AWS CodeStar ties the dashboard into Amazon CloudWatch with the Application activity section, provides data about the build and deployment status in the Continuous Deployment section with AWS CodePipeline, and shows the latest Git code commit with AWS CodeCommit in the Commit history section.

Summary

In my journey through the AWS CodeStar service, I created a serverless web application project that provisioned my entire development toolchain for coding, building, testing, and deployment using AWS services. Amazingly, I have yet to scratch the surface of the benefits of using AWS CodeStar to manage day-to-day software development activities involved in releasing applications.

AWS CodeStar makes it easy for you to quickly develop, build, and deploy applications on AWS. AWS CodeStar provides a unified user interface, enabling you to easily manage your software development activities in one place. AWS CodeStar allows you to choose from various templates to set up projects using AWS Lambda, Amazon EC2, or AWS Elastic Beanstalk. It comes pre-configured with a project management dashboard, an automated continuous delivery pipeline, and a Git code repository using AWS CodeCommit, AWS CodeBuild, AWS CodePipeline, and AWS CodeDeploy, allowing developers to implement modern agile software development best practices. Each AWS CodeStar project gives developers a head start in development by providing working code samples that can be used with popular IDEs that support Git. Additionally, AWS CodeStar provides out-of-the-box integration with Atlassian JIRA Software, providing a project management and issue tracking system for your software team directly from the AWS CodeStar console.

You can get started using the AWS CodeStar service for developing new software projects on AWS today. Learn more by reviewing the AWS CodeStar product page and the AWS CodeStar user guide documentation.

Tara

Launch: Amazon Athena adds support for Querying Encrypted Data

by Tara Walker | in Amazon Athena, Key Management Service, Launch

In November of last year, we brought a service to market that we hoped would be a major step toward helping those who need to securely access and examine massive amounts of data on a daily basis. This service is none other than Amazon Athena, which I think of as a managed service that is attempting "to leap tall queries in a single bound" with querying of object storage. It is a service that gives AWS customers the power to easily analyze and query large amounts of data stored in Amazon S3.

Amazon Athena is a serverless interactive query service that enables users to easily analyze data in Amazon S3 using standard SQL. At Athena's core is Presto, a distributed SQL engine that runs queries with ANSI SQL support, combined with Apache Hive, which allows Athena to work with popular data formats like CSV, JSON, ORC, Avro, and Parquet and adds common Data Definition Language (DDL) operations like create, drop, and alter tables. Athena enables performant query access to datasets stored in Amazon Simple Storage Service (S3) in both structured and unstructured data formats.

You can write Hive-compliant DDL statements and ANSI SQL statements in the Athena Query Editor from the AWS Management Console, or from SQL clients such as SQL Workbench by downloading and using the Athena JDBC driver. Additionally, with the JDBC driver you can run queries programmatically from your desired BI tools. You can read more about the Amazon Athena service in Jeff's blog post from the service release in November.

After releasing the initial features of the Amazon Athena service, the Athena team kept with the Amazon tradition of focusing on the customer by working diligently to make your experience with the service better. The team has now added a feature that I am excited to announce: Amazon Athena now supports querying encrypted data in Amazon S3. This new feature not only makes it possible to query encrypted data in Amazon S3, but also enables encrypting the results of Athena queries. Businesses and customers who have requirements and/or regulations to encrypt sensitive data stored in Amazon S3 can now take advantage of the serverless dynamic queries Athena offers with their encrypted data.

 

Supporting Encryption

Before we dive into using the new Athena feature, let's take some time to review the encryption options that S3 and Athena support for customers needing to secure and encrypt data. Currently, S3 supports encrypting data with AWS Key Management Service (KMS). AWS KMS is a managed service for the creation and management of encryption keys used to encrypt data. In addition, S3 supports customers using their own encryption keys to encrypt data. Since it is important to understand the encryption options that Athena supports for datasets stored on S3, in the chart below I have provided a breakdown of the encryption options supported with S3 and Athena, and noted when the new Athena table property, has_encrypted_data, is required for encrypted data access.

 

For more information on Amazon S3 encryption with AWS KMS or Amazon S3 encryption options, review the AWS KMS Developer Guide topic How Amazon Simple Storage Service (Amazon S3) Uses AWS KMS and the Amazon S3 Developer Guide topic Protecting Data Using Encryption, respectively.

 

Creating & Accessing Encrypted Databases and Tables

As I noted before, there are a couple of ways to access Athena. Of course, you can access Athena through the AWS Management Console, but you also have the option to use the JDBC driver with SQL clients like SQL Workbench and other Business Intelligence tools. In addition, the JDBC driver allows for programmatic query access.

Enough discussion; it is time to dig into this new Athena feature by creating a database and some tables, running queries against the tables, and encrypting the query results. We'll accomplish all this by using encrypted data stored in Amazon S3.

If this is your first time logging into the service, you will see the Amazon Athena Getting Started screen as shown below. You will need to click the Get Started button to be taken to the Athena Query Editor.

Now that we are in the Athena Query Editor, let's create a database. If the sample database is shown when you open your Query Editor, simply start typing your query statement in the Query Editor window to replace the sample query and create the new database.

I will now issue the Hive DDL Command, CREATE DATABASE <dbname> within the Query Editor window to create my database, tara_customer_db.

Once I receive the confirmation that my query execution was successful in the Results tab of Query Editor, my database should be created and available for selection in the dropdown.

I now will change my selected database in the dropdown to my newly created database, tara_customer_db.

 

 

With my database created, I am able to create tables from my data stored in S3. Since I did not have data encrypted with the various encryption types, the product group was kind enough to give me some sample data files to place in my S3 buckets. The first batch of sample data that I received was encrypted with SSE-KMS, which, if you recall from the encryption matrix we discussed above, is the encryption type Server-Side Encryption with AWS KMS–Managed Keys. I stored this set of encrypted data in my S3 bucket aptly named: aws-blog-tew-posts/SSE_KMS_EncryptionData. The second batch of sample data was encrypted with CSE-KMS, which is the encryption type Client-Side Encryption with AWS KMS–Managed Keys, and is stored in my aws-blog-tew-posts/CSE_KMS_EncryptionData S3 bucket. The last batch of data I received is just good old-fashioned plain text, and I have stored this data in the S3 bucket, aws-blog-tew-posts/PlainText_Table.

Remember, to access my data in the S3 buckets from the Athena service, I must ensure that my data buckets have the correct permissions to allow Athena to access each bucket and the data contained therein. In addition, working with AWS KMS encrypted data requires users to have roles that include the appropriate KMS key policies. It is important to note that to successfully read KMS encrypted data, users must have the correct permissions for access to S3, Athena, and KMS collectively.

There are several ways that I can provide the appropriate access permissions between S3 and the Athena service:

  1. Allow access via user policy
  2. Allow access via bucket policy
  3. Allow access with both a bucket policy and user policy.

You can learn more about Amazon Athena access permissions and Amazon S3 permissions by reviewing the Athena documentation on Setting User and Amazon S3 Bucket Permissions.

Since my data is ready and set up in my S3 buckets, I just need to head over to the Athena Query Editor and create my first new table from the SSE-KMS encrypted data. The DDL command that I will use to create my new table, sse_customerinfo, is as follows:

CREATE EXTERNAL TABLE sse_customerinfo( 
  c_custkey INT, 
  c_name STRING, 
  c_address STRING, 
  c_nationkey INT, 
  c_phone STRING, 
  c_acctbal DOUBLE, 
  c_mktsegment STRING, 
  c_comment STRING
  ) 
ROW FORMAT SERDE  'org.apache.hadoop.hive.ql.io.parquet.serde.ParquetHiveSerDe' 
STORED AS INPUTFORMAT  'org.apache.hadoop.hive.ql.io.parquet.MapredParquetInputFormat' 
OUTPUTFORMAT  'org.apache.hadoop.hive.ql.io.parquet.MapredParquetOutputFormat' 
LOCATION  's3://aws-blog-tew-posts/SSE_KMS_EncryptionData';

I will enter my DDL command statement for the sse_customerinfo table creation into my Athena Query Editor and click the Run Query button. The Results tab will note that the query ran successfully, and you will see my new table show up under the tables available for the tara_customer_db database.

I will repeat this process to create my cse_customerinfo table from the CSE-KMS encrypted batch of data and then the plain_customerinfo table from the unencrypted data source stored in my S3 bucket. The DDL statements used to create my cse_customerinfo table are as follows:

CREATE EXTERNAL TABLE cse_customerinfo (
  c_custkey INT, 
  c_name STRING, 
  c_address STRING, 
  c_nationkey INT, 
  c_phone STRING, 
  c_acctbal DOUBLE, 
  c_mktsegment STRING, 
  c_comment STRING
)
ROW FORMAT SERDE   'org.apache.hadoop.hive.ql.io.parquet.serde.ParquetHiveSerDe' 
STORED AS INPUTFORMAT  'org.apache.hadoop.hive.ql.io.parquet.MapredParquetInputFormat' 
OUTPUTFORMAT  'org.apache.hadoop.hive.ql.io.parquet.MapredParquetOutputFormat'
LOCATION   's3://aws-blog-tew-posts/CSE_KMS_EncryptionData'
TBLPROPERTIES ('has_encrypted_data'='true');

Again, I will enter my DDL statements above into the Athena Query Editor and click the Run Query button. If you review the DDL statements used to create the cse_customerinfo table carefully, you will notice a new table property (TBLPROPERTIES) flag, has_encrypted_data, was introduced with the new Athena encryption capability. This flag tells Athena that the data in S3 to be used with queries for the specified table is encrypted. If you take a moment and refer back to the encryption matrix we reviewed earlier for the Athena and S3 encryption options, you will see that this flag is only required when you are using the Client-Side Encryption with AWS KMS–Managed Keys option. Once the cse_customerinfo table has been successfully created, a key symbol will appear next to the table, identifying it as an encrypted data table.

Finally, I will create the last table, plain_customerinfo, from our sample data, following the same steps as for the previous tables. The DDL commands for this table are:

CREATE EXTERNAL TABLE plain_customerinfo(
  c_custkey INT, 
  c_name STRING, 
  c_address STRING, 
  c_nationkey INT, 
  c_phone STRING, 
  c_acctbal DOUBLE, 
  c_mktsegment STRING, 
  c_comment STRING
)
ROW FORMAT SERDE 'org.apache.hadoop.hive.ql.io.parquet.serde.ParquetHiveSerDe' 
STORED AS INPUTFORMAT 'org.apache.hadoop.hive.ql.io.parquet.MapredParquetInputFormat' 
OUTPUTFORMAT 'org.apache.hadoop.hive.ql.io.parquet.MapredParquetOutputFormat'
LOCATION 's3://aws-blog-tew-posts/PlainText_Table';


Great! We have successfully read encrypted data from S3 with Athena, and created tables based on the encrypted data. I can now run queries against my newly created encrypted data tables.

 

Running Queries

Running queries against our new database tables is very simple, using standard SQL statements against your data stored in Amazon S3. For our query review, I am going to use Athena's preview data feature. In the list of tables, you will see two icons beside each table. One is a table property icon; selecting it brings up the selected table's properties. The other icon, displayed as an eye symbol, is the preview data feature, which generates a simple SELECT query statement for the table.

 

 

To demonstrate running queries with Athena, I have chosen to preview data for my plain_customerinfo table by selecting the eye symbol/icon next to the table. The preview data feature creates the following SQL statement:

SELECT * FROM plain_customerinfo limit 10;

The query results from using the preview data feature with my plain_customerinfo table are displayed in the Results tab of the Athena Query Editor and provides the option to download the query results by clicking the file icon.

The new Athena encrypted data feature also supports encrypting query results and storing these results in Amazon S3. To take advantage of this feature with my query results, I will now encrypt and save my query data in a bucket of my choice. You should note that the data table that I have selected is currently unencrypted.
First, I'll select the Athena Settings menu and review the current storage settings for my query results. Since I do not have a KMS key to use for encryption, I will select the Create KMS key hyperlink and create a KMS key for use in encrypting my query results with Athena and S3. For details on how to create a KMS key and configure the appropriate user permissions, please see http://docs.aws.amazon.com/kms/latest/developerguide/create-keys.html.
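For reference, creating such a key from the AWS CLI might look like the following sketch; the alias and example key ID are placeholders:

# Create a KMS key for encrypting Athena query results and give it a friendly alias.
# The alias name and the key ID passed to create-alias are placeholders.
aws kms create-key --description "Key for encrypting Athena query results"
aws kms create-alias \
  --alias-name alias/s3encryptathena \
  --target-key-id 1234abcd-12ab-34cd-56ef-1234567890ab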

After successfully creating my s3encryptathena KMS key and copying the key ARN for use in my Athena settings, I return to the Athena console Settings dialog and select the Encrypt query results option. I then update the Query result location textbox to point to my S3 bucket, aws-athena-encrypted, which will be the location for storing my encrypted query results.

The only thing left is to select the Encryption type and enter my KMS key. I can do this by either selecting the s3encryptathena key from the Encryption key dropdown or entering its ARN in the KMS key ARN textbox. In this example, I have chosen to use SSE-KMS for the encryption type. You can see both examples of selecting the KMS key below. Clicking the Save button completes the process.
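If you are running queries programmatically rather than from the console, the same result encryption can be requested per query. The sketch below assumes an AWS CLI version that includes the Athena API; the bucket, account ID, and key ID are placeholders:

# Run a query and ask Athena to encrypt the results written to S3 with SSE-KMS.
# Bucket name, account ID, and KMS key ID are placeholders.
aws athena start-query-execution \
  --query-string "SELECT * FROM plain_customerinfo limit 10;" \
  --query-execution-context Database=tara_customer_db \
  --result-configuration "OutputLocation=s3://aws-athena-encrypted/,EncryptionConfiguration={EncryptionOption=SSE_KMS,KmsKey=arn:aws:kms:us-east-1:123456789012:key/1234abcd-12ab-34cd-56ef-1234567890ab}"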

Now I will rerun my current query for my plain_customerinfo table. Remember, this table is not encrypted, but with the Athena settings changes I made to encrypt query results, the results of queries run against this table will now be stored with SSE-KMS encryption using my KMS key.

After my query rerun, I can see the fruits of my labor by going to the Amazon S3 console and viewing the CSV data files saved in my designated bucket, aws-athena-encrypted, and the SSE-KMS encryption of the bucket and files.

 

Summary

Needless to say, this Athena launch has several benefits for those needing to secure data via encryption while still retaining the ability to perform queries and analytics for data stored in varying data formats. Additionally, this release includes improvements I did not dive into with this blog post.

  • A new version of the JDBC driver that supports the new encryption feature and key updates.
  • Added the ability to add, replace, and change columns using ALTER TABLE.
  • Added support for querying LZO-compressed data.

See the release documentation in the Athena user guide for more details, and start using Athena to query your encrypted data stored in Amazon S3 now by reviewing the Configuring Encryption Options section in the Athena documentation.

Learn more about Athena and serverless queries on Amazon S3 by visiting the Athena product page or reviewing the Athena User Guide. In addition, you can dig deeper on the functionality of Athena and data encryption with S3 by reviewing the AWS Big Data Blog post: Analyzing Data in S3 using Amazon Athena and the AWS KMS Developer Guide.

Happy Encrypting!

Tara

Amazon CloudWatch launches Alarms on Dashboards

by Tara Walker | in Amazon CloudWatch, Launch

Amazon CloudWatch is a service that gives customers the ability to monitor their applications, systems, and solutions running on Amazon Web Services by providing and collecting metrics, logs, and events about AWS resources in real time. CloudWatch automatically provides key resource measurements such as latency, error rates, and CPU usage, while also enabling monitoring of custom metrics via customer-supplied logs and system data.

Last November, Amazon CloudWatch added new Dashboard Widgets to provide additional data visualization options for all available metrics. In order to provide customers with even more insight into their solutions and resources running on AWS, CloudWatch has launched Alarms on Dashboards. With this alarms enhancement, customers can view alarms and metrics in the same dashboard widget enabling them to perform data-driven troubleshooting and analysis.

CloudWatch dashboards are designed with the goal of providing better visibility when monitoring AWS resources across regions in a consolidated view. Since CloudWatch dashboards are highly customizable, users can create their own custom dashboards to graphically represent data for varying metrics such as utilization, performance, estimated billing, and now alarm conditions. An alarm tracks a single metric over time based on the value of the metric in relation to a specified threshold. When the alarm state changes, an action such as an Auto Scaling policy is executed or a notification is sent to Amazon SNS, among other options.

With the ability to add alarms to dashboards, CloudWatch users have another mechanism to proactively monitor and receive alerts about their AWS resources and applications across multiple regions. In addition, the metric data associated with an alarm, which has been added to a dashboard, can be charted and reviewed. Alarms have three possible states:

  • OK: The value of the alarm metric does not meet the threshold
  • INSUFFICIENT_DATA: The alarm has just started, or the alarm metric does not have enough data to determine whether it is in the OK state or the ALARM state
  • ALARM: The value of the alarm metric meets the threshold

When added to a dashboard, alarms are displayed in red when in the Alarm state, gray when in the Insufficient data state and shown with no color fill when the alarm is in the OK state. Alarms added to a dashboard are supported with the following widgets: Line, Number, and Stacked Graph widgets.

  • Number widget: provides a quick and efficient view of the latest value of any desired metric. Using the widget with alarms, the view of the state of the alarm is shown with different background colors for the latest metric data.
  • Line widget: allows the visualization of the actual value of any collection of chosen metrics. Provides a view on the dashboard of the state of the alarm, which displays the alarm threshold and condition as a horizontal line. The threshold line can act as a good indicator to view the degree of the alarm.
  • Stacked graph widget: allows customers to visualize the net total effect of any collection of chosen metrics. The stacked graph widget layers one metric over another in order to illustrate the distribution and contribution of each metric, and has the option to display the contribution of metrics in percentages. With alarms, it also provides a view of the state of the alarm, which displays the alarm threshold and condition as a horizontal line.

Currently, adding multiple metrics onto the same widget for an alarm is in the works and this feature is evolving based on customer feedback.

Adding Alarms on Dashboards

Let's take a quick look at utilizing alarms on a CloudWatch dashboard. In the AWS Console, I will go to the CloudWatch service. Once in the CloudWatch console, I select Dashboards. I will click the Create dashboard button and create the CloudWatchBlog dashboard.

 

Upon creation of my CloudWatchBlog dashboard, a dialog box will open to allow me to add widgets to the dashboard. I will forego adding widgets for now since I want to focus on adding alarms on my dashboard. Therefore, I will hit the Cancel button here and go to the Alarms section of the CloudWatch console.

Once in the Alarms section of the CloudWatch console, you will see all of your alarms and the state of each of the alarms for the current region displayed.

As we mentioned earlier, there are three types of alarm states, and as you can see in my console above, alarms in all of the different states are displayed. If desired, you can adjust the filter on the console to display alarms filtered by alarm state type.

As an example, I am only interested in viewing the alarms with an alarm state of ALARM. Therefore, I will adjust the filter to show only the alarms in the current region with an alarm state as ALARM.

Now only the two alarms that have a current alarm state of ALARM are displayed. One of these alarms is for monitoring the provisioned write capacity units of an Amazon DynamoDB table, and the other is to monitor the CPU utilization of my active Amazon Elasticsearch instance.

Let’s examine the scenario in which I leverage my CloudWatchBlog dashboard as my troubleshooting mechanism for identifying and diagnosing issues with my Elasticsearch solution and its instances. I will first add the Amazon Elasticsearch CPU utilization alarm, ES Alarm, to my CloudWatchBlog dashboard. To add the alarm, I simply select the checkbox by the desired alarm, which in this case is ES Alarm. Then with the alarm selected, I click the Add to Dashboard button.

The Add to dashboard dialog box will open, allowing me to select my CloudWatchBlog dashboard. Additionally, I can select the widget type I would like to use for the display of my alarm. For the ES Alarm, I will choose the Line widget and complete the process of adding this alarm to my dashboard by clicking the Add to dashboard button.

Upon successfully adding ES Alarm to the CloudWatchBlog dashboard, you will see a confirmation notice displayed in the CloudWatch console.

If I then go to the Dashboard section of the console and select my CloudWatchBlog dashboard, I will see the line widget for my alarm, ES Alarm, on the dashboard. To ensure that my ES Alarm widget is a permanent part of the dashboard, I will click the Save dashboard button to preserve the addition of this widget on the dashboard.

As we discussed, one of the benefits of utilizing a CloudWatch dashboard is the ability to add several alarms from various regions onto a dashboard. Since my scenario is leveraging my dashboard as a troubleshooting mechanism for my Elasticsearch solution, I would like to have several alarms and metrics related to my solution displayed on the CloudWatchBlog dashboard. Given this, I will create another alarm for my Elasticsearch instance and add it to my dashboard.

I will first return to the Alarms section of the console and click the Create Alarm button.

The Create Alarm dialog box is displayed showing all of the current metrics available in this region. From the summary, I can quickly see that there are 21 metrics being tracked for Elasticsearch. I will click on the ES Metrics link to view the individual metrics that can be used to create my alarm.

I can review the individual metrics shown for my Elasticsearch instance, and choose which metric I want to base my new alarm on. In this case, I choose the WriteLatency metric by selecting the checkbox for this metric and then click the Next button.

 

The next screen is where I fill in all the details about my alarm: name, description, alarm threshold, time period, and alarm action. I will name my new alarm, ES Latency Alarm, and complete the rest of the aforementioned data fields. To complete the creation of my new alarm, I click the Create Alarm button.
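For completeness, a comparable alarm could also be created from the AWS CLI. This is only a sketch; the threshold, period, and dimension values are assumptions for illustration:

# Create an alarm on the Amazon Elasticsearch WriteLatency metric.
# The threshold, period, and dimension values (domain name, account ID) are illustrative.
aws cloudwatch put-metric-alarm \
  --alarm-name "ES Latency Alarm" \
  --namespace AWS/ES \
  --metric-name WriteLatency \
  --dimensions Name=DomainName,Value=taratestdomain Name=ClientId,Value=123456789012 \
  --statistic Average \
  --period 300 \
  --evaluation-periods 1 \
  --threshold 100 \
  --comparison-operator GreaterThanThreshold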

I will see a confirmation message box at the top of the Alarms console upon successful completion of adding the alarm, and the status of the newly created alarm will be displayed in the alarms list.

Now I will add my ES Latency Alarm to my CloudWatchBlog dashboard. Again, I click on the checkbox by the alarm and then click the Add to Dashboard button.

This time when the Add to Dashboard dialog comes up, I will choose the Stacked area widget to display the ES Latency Alarm on my CloudWatchBlog dashboard. Clicking the Add to Dashboard button will complete the addition of my ES Latency Alarm widget to the dashboard.

Once back in the console, again I will see the confirmation noting the successful addition of the widget. I go to the Dashboards and click on the CloudWatchBlog dashboard and I can now view the two widgets in my dashboard. To include this widget in the dashboard permanently, I click the Save dashboard button.

The final thing to note about the new CloudWatch feature, Alarms on Dashboards, is that alarms and metrics from other regions can be added to the dashboard for a complete view for troubleshooting. Let’s add a metric to the dashboard with the alarms widget.

Within the console, I will move from my current region, US East (Ohio), to the US East (N. Virginia) region.

Now I will go to the Metric section of the CloudWatch console. This section displays the metrics from services used in the US East (N. Virginia) region.

My Elasticsearch solution triggers Lambda functions to capture all of the EmployeeInfo DynamoDB database CRUD (Create, Read, Update, Delete) changes via DynamoDB streams and write those changes into my Elasticsearch domain, taratestdomain. Therefore, I will add metrics to my CloudWatchBlog dashboard to track table metrics from DynamoDB.

Therefore, I am going to add the EmployeeInfo database ProvisionedWriteCapacityUnits metric to my CloudWatchBlog dashboard.

Back again in the Add to Dashboard dialog, I will select my CloudWatchBlog dashboard and choose to display this metric using the Number widget.

Now the ProvisionedWriteCapacityUnits metric from US East (N. Virginia) is displayed in the CloudWatchBlog dashboard with the Number widget, alongside the alarms from US East (Ohio). To make this update permanent in the dashboard, I will (you guessed it!) click the Save dashboard button.

Summary

Getting started with alarms on dashboards is easy. You can use alarms on dashboards across regions to proactively monitor alarms, build troubleshooting playbooks, and view desired metrics. You can also choose the metric first in the Metrics UI and then change the type of widget according to the visualization that fits the metric.

Alarms on Dashboards are supported on Line, Stacked Area, and Number widgets. In addition, you can use Text widgets next to alarms on a dashboard to add steps or observations on how to handle changes in the alarm state. To learn more about Amazon CloudWatch widgets and about the additional dashboard widgets, visit the Amazon CloudWatch documentation and the CloudWatch Getting Started guide.

 

Tara

New – Amazon EMR Instance Fleets

by Randall Hunt | in Amazon EMR, Launch

Today we're excited to introduce a new feature for Amazon EMR clusters called instance fleets. Instance fleets give you a wider variety of options and intelligence around instance provisioning. You can now provide a list of up to 5 instance types with corresponding weighted capacities and Spot bid prices (including Spot blocks)! EMR will automatically provision On-Demand and Spot capacity across these instance types when creating your cluster. This can make it easier and more cost-effective to quickly obtain and maintain your desired capacity for your clusters.

You can also specify a list of Availability Zones, and EMR will optimally launch your cluster in one of the AZs. EMR will also continue to rebalance your cluster in the case of Spot instance interruptions by replacing instances with any of the available types in your fleet. This makes it easier to maintain your cluster's overall capacity. Instance fleets can be used instead of instance groups. Just like groups, your cluster will have master, core, and task fleets.

Let’s take a look at the console updates to get an idea of how these fleets work.

We’ll start by navigating to the EMR console and clicking the Create Cluster button. That should bring us to our familiar EMR provisioning console where we can navigate to the advanced options near the top left.

We’ll select the latest EMR version (instance fleets are available for EMR versions 4.8.0 and greater, with the exception of 5.0.x) and click next.

Now we get to the good stuff! We’ll select the new instance fleet option in the hardware options.

Now what I want to do is modify our core group to have a couple of instance types that will satisfy the needs of my cluster.


EMR will provision capacity in each instance fleet and availability zone to meet my requirements in the most cost effective way possible. The EMR console provides an easy mapping of vCPU to weighted capacity for each instance type, making it easy to use vCPU as the capacity unit (I want 16 total vCPUs in my core fleet). If the vCPU units don’t match my criteria for weighting instance types I can change the “Target capacity” selector to include arbitrary units and define my own weights (this is how the API/CLI consume capacity units as well).

When the cluster is being provisioned, if EMR is unable to obtain the desired Spot capacity within a user-defined timeout, you can have it terminate or fall back to On-Demand instances to provision the rest of the capacity.

All this functionality for instance fleets is also available from the AWS SDKs and the CLI. Let’s take a look at how we would provision our own instance fleet.

First we’ll create our configuration json in my-fleet-config.json:

[
  {
    "Name": "MasterFleet",
    "InstanceFleetType": "MASTER",
    "TargetOnDemandCapacity": 1,
    "InstanceTypeConfigs": [{"InstanceType": "m3.xlarge"}]
  },
  {
    "Name": "CoreFleet",
    "InstanceFleetType": "CORE",
    "TargetSpotCapacity": 11,
    "TargetOnDemandCapacity": 11,
    "LaunchSpecifications": {
      "SpotSpecification": {
        "TimeoutDurationMinutes": 20,
        "TimeoutAction": "SWITCH_TO_ON_DEMAND"
      }
    },
    "InstanceTypeConfigs": [
      {
        "InstanceType": "r4.xlarge",
        "BidPriceAsPercentageOfOnDemandPrice": 50,
        "WeightedCapacity": 1
      },
      {
        "InstanceType": "r4.2xlarge",
        "BidPriceAsPercentageOfOnDemandPrice": 50,
        "WeightedCapacity": 2
      },
      {
        "InstanceType": "r4.4xlarge",
        "BidPriceAsPercentageOfOnDemandPrice": 50,
        "WeightedCapacity": 4
      }
    ]
  }
]

Now that we have our configuration, we can use the AWS CLI's emr subcommand to create a new cluster with that configuration:

aws emr create-cluster --release-label emr-5.4.0 \
--applications Name=Spark Name=Hive Name=Zeppelin \
--service-role EMR_DefaultRole \
--ec2-attributes "InstanceProfile=EMR_EC2_DefaultRole,SubnetIds=[subnet-1143da3c,subnet-2e27c012]" \
--instance-fleets file://my-fleet-config.json
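Once the cluster is up, we can inspect the provisioned fleets and their capacities from the CLI as well; the cluster ID below is a placeholder:

# List the instance fleets (and their target/provisioned capacities) for a cluster.
# The cluster ID is a placeholder.
aws emr list-instance-fleets --cluster-id j-2AXXXXXXGAPLF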

If you’re eager to get started the feature is available now at no additional cost and you can find detailed documentation to help you get started here.

Thanks to the EMR service team for their help writing this post!

Randall Hunt

Launch: Amazon ElastiCache Launches Enhanced Redis Backup and Restore with Cluster Resizing

by Tara Walker | in Amazon ElastiCache, Launch

Most of us equate in-memory caching with improved performance and lower cost at scale when designing applications or building solutions. Now, if only there were a service that would continually make it simpler to deploy and utilize an in-memory cache in the cloud while increasing the ability to scale.

Okay, no more joking around; the cloud service that provides this great functionality is, of course, Amazon ElastiCache. Amazon ElastiCache is an AWS managed service that provides a performant in-memory data store or cache in the cloud, while offering a straightforward way to create, scale, and manage a distributed environment for low-latency, secure access to I/O-intensive or compute-heavy data. Additionally, ElastiCache reduces the overhead of managing infrastructure for your in-memory data structure server or cache by detecting and replacing failed nodes, while providing enhanced visibility into key performance metrics of the caching system nodes via Amazon CloudWatch. This exciting service is now launching support for Enhanced Redis Backup and Restore with Cluster Resizing.

For those of you familiar with Amazon ElastiCache, you are likely aware that ElastiCache currently supports two in-memory key-value engines:

  • Memcached: an open source, high-performing, distributed memory object caching system developed in 2003 with the initial goal of speeding up dynamic web applications by alleviating database load
  • Redis: an open source in-memory data structure store launched in 2009, used as a database, cache, and message broker, with built-in replication, atomic operation support, various levels of on-disk persistence, and high availability via Redis Cluster.

In October of 2016, support was added for Redis Cluster with Redis 3.2.4. This allowed ElastiCache Redis users not only to take advantage of Redis Cluster, but also to:

  • Create cluster-level backups.
  • Produce snapshots of each of the cluster’s shards contained within backups.
  • Scale their workloads with 3.5TiB of data across up to 15 shards.

You can read more about using Redis with ElastiCache and the related features by reviewing the product page for Amazon ElastiCache for Redis.

With the launch of the Enhanced Backup and Restore with Cluster Resizing feature, ElastiCache is providing even deeper support for Redis with a clear-cut migration path to a managed Redis Cluster experience. There are several benefits of this enhancement for ElastiCache and Redis users alike, such as:

  • Ability to restore backup into a Redis Cluster with a different number of shards and slot distribution
  • Deliver the capability for users to resize Redis workloads
  • Allow Redis database file (RDB) snapshots as input for creating a sharded Redis Cluster
  • Offer option to use snapshot(s) of Redis on EC2 implementations (both Redis Cluster and single-shard Redis) as data input for sharded Redis Cluster creation

To accomplish these tasks, ElastiCache will parse the Redis key space across the backup’s individual snapshots, and redistribute the keys in the new Cluster according to the requested number of shards and hash slots. You would simply take your RDB snapshots and store them on S3, then provide ElastiCache with the desired number of shards and the snapshot file. ElastiCache handles the heavy lifting of restoring the Redis data store into a Redis cluster.

I am sure that you all may be thinking: is it really that easy to leverage the Enhanced Redis Backup and Restore with Cluster Resizing feature in ElastiCache? Well, there is no time like the present to find out. Let's take a trip to the AWS Management Console and put this newly launched enhancement into action by restoring an external RDB snapshot to a new cluster using ElastiCache.

My first stop in the AWS Management console is to the Amazon S3 console. I have some Redis .rdb snapshot files I received from some of my peers here at AWS in order to test the restore of an external Redis snapshot to ElastiCache. I will need to put these files into Amazon S3 so that I can access the snapshots as input for my ElastiCache Redis cluster.

In the S3 console, I will go to my S3 bucket, aws-blog-tew-posts, that I created for testing and development purposes. I’ll upload the .rdb snapshot files that were provided to me into this S3 bucket.

 

It is important to note that the name of your S3 bucket must conform to DNS standards. To be DNS-compliant, the name must be at least three characters, must contain only lowercase letters, numbers, and/or dashes, and it must start and end with a lowercase letter or number. While this may be obvious, I will also note that the bucket name cannot be in an IP address format. You can learn more about the S3 Bucket Restrictions at the link provided here.

With my .rdb files successfully uploaded into my aws-blog-tew-posts bucket, I need to take note of the S3 path to these backup files. For these files, the path would be aws-blog-tew-posts/dump_1.rdb or aws-blog-tew-posts/dump_10.rdb. If you have placed your files into a folder, the folder name would need to be included in this path, i.e. thebucketname/thefoldername/thefilename.

For ElastiCache to access these files, I need to ensure that the service has read permissions for each of the files. To provide access, I will update the permissions for each of the .rdb files by adding a grantee with the canonical ID for my region and granting that user Open/Download permissions. The canonical ID for all regions, outside of China (Beijing) and AWS GovCloud (US), is as follows:

540804c33a284a299d2547575ce1010f2312ef3da9b3a053c8bc45bf233e4353

After I click the Save button, I am all set to use these files as input for an ElastiCache Redis cluster.
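
If you have more than a handful of snapshot files, granting this permission object by object in the console gets tedious. As a rough sketch of how the same grant could be applied with the AWS SDK for Python (boto3), using the bucket and file names from this post, it might look like the following; note that put_object_acl replaces the existing ACL, so the object owner's access is re-granted explicitly:

import boto3

s3 = boto3.client('s3')

# Canonical ID used by ElastiCache in all regions outside China (Beijing) and AWS GovCloud (US)
ELASTICACHE_CANONICAL_ID = '540804c33a284a299d2547575ce1010f2312ef3da9b3a053c8bc45bf233e4353'

for key in ['dump_1.rdb', 'dump_10.rdb']:
    # Look up the object owner so we can keep full control while adding the new grant
    owner_id = s3.get_object_acl(Bucket='aws-blog-tew-posts', Key=key)['Owner']['ID']
    s3.put_object_acl(
        Bucket='aws-blog-tew-posts',
        Key=key,
        GrantFullControl='id="{}"'.format(owner_id),
        GrantRead='id="{}"'.format(ELASTICACHE_CANONICAL_ID),  # Open/Download for ElastiCache
    )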

The next step is to go to the ElastiCache console. Here I will create a new ElastiCache Redis cluster and seed it with data from one of the RDB snapshots located in my S3 bucket. I’ll choose the dump_1.rdb snapshot file as the data input to seed this new cluster. Since I want to explore the ElastiCache Redis capabilities added this past October with Redis Cluster support in 3.2.4, as well as the new Backup and Restore with Cluster Resizing enhancements, I’ll create a new Redis cluster and ensure cluster mode is enabled. At this point, I should note that you cannot restore from a backup created using a Redis (cluster mode enabled) cluster to a Redis (cluster mode disabled) cluster.

First, I click the Get Started Now button on the ElastiCache console dashboard (or the Create button, depending on your console view).

In the Create your Amazon ElastiCache cluster dialog window, I’ll select Redis for my caching engine and make sure I click the checkbox for Cluster Mode enabled (Scale Out). The name of my new cluster will be tew-rediscluster, and since I am enabling cluster mode, my ElastiCache Redis engine version is 3.2.4. For this cluster, I will keep the default Redis port of 6379.

The key benefit of the ElastiCache enhanced Redis Backup and Restore feature is the cluster resizing capability that allows me to build a new cluster with a different number of shards than was originally used for the backup file. To build the new Redis Cluster, I am using only one RDB snapshot file, dump_1.rdb which is a small Redis instance backup with only one shard. However, in the creation of my new tew-rediscluster, I have opted for 3 shards with 2 replicas per shard.

In addition, I have the ability to specify a node type for my new cluster that is a different size than my original instance from the RDB snapshot. As I mentioned, the dump_1.rdb is a backup of a Redis instance that is significantly smaller than the size of the chosen node type for my tew-rediscluster shown below.

There are other options and data input needed in order to complete the creation of my ElastiCache Redis cluster that I will not show in this blog post. However, if you want to go through each of the steps necessary for creating an ElastiCache Redis cluster you can find more information in the AWS ElastiCache Getting Started documentation for Launch a Cluster.

Once I have provided all the information needed to create my ElastiCache Redis cluster, I will need to tell ElastiCache how to seed the cluster with the .rdb file by providing the file location from my S3 bucket. In the Import Data to Cluster section of the create dialog, I will enter the S3 path to my dump_1.rdb in the Seed RDB file S3 location textbox. Remember, the nomenclature for the S3 file path is Bucket/Folder/ObjectName so I will enter aws-blog-tew-posts/dump_1.rdb as the path to the RDB file in S3. All that is left now is to click the Create button.
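
If you would rather script this step than click through the console, the same seeded, sharded cluster can be requested through the ElastiCache API. Here is a minimal sketch using the AWS SDK for Python (boto3); the node type is an illustrative assumption and should be sized for your own data:

import boto3

elasticache = boto3.client('elasticache')

# Create a Redis (cluster mode enabled) replication group seeded from an RDB snapshot in S3
response = elasticache.create_replication_group(
    ReplicationGroupId='tew-rediscluster',
    ReplicationGroupDescription='Redis cluster restored from an external RDB snapshot',
    Engine='redis',
    EngineVersion='3.2.4',
    CacheParameterGroupName='default.redis3.2.cluster.on',  # cluster mode enabled
    CacheNodeType='cache.r3.large',   # assumption: choose a node type sized for your workload
    NumNodeGroups=3,                  # three shards
    ReplicasPerNodeGroup=2,           # two replicas per shard
    Port=6379,
    SnapshotArns=['arn:aws:s3:::aws-blog-tew-posts/dump_1.rdb'],  # seed data from S3
)
print(response['ReplicationGroup']['Status'])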

 

That’s it! ElastiCache goes to work creating the new Redis cluster. After a short time, the ElastiCache console shows my new Amazon ElastiCache Redis cluster as available, and I have successfully created this cluster with data restored from an external RDB snapshot file.

 

I just demonstrated how to create an ElastiCache Redis cluster using an external RDB snapshot, but of course you can also create backups of, and restore from backups of, your existing ElastiCache Redis clusters. To dig deeper into this newly launched feature, visit Restoring From a Backup with Cluster Resizing in the Amazon ElastiCache User Guide.

To learn more about making your applications more performant with Amazon ElastiCache, visit the AWS Amazon ElastiCache page for product details, resources, and customer testimonials.

– Tara

Launch: Amazon GameLift Now Supports All C++ and C# Game Engines

by Tara Walker | on | in Amazon GameLift, Amazon GameLift Server SDK for C#, Amazon GameLift Server SDK for C++, Games, Launch | | Comments

Calling all Game Developers! GDC 2017 was a blast in San Francisco a couple of weeks ago, so there is no better time to be inspired and passionate about learning and building cool games.

Therefore, I am excited to share that Amazon GameLift is now available for all C++ and C# game engines, including Amazon Lumberyard, Unreal Engine, and Unity, all with enhanced game session matching capabilities. For those of you not familiar with Amazon GameLift, let me introduce this managed service designed to aid game developers in delivering fun and innovative online game experiences.

Amazon GameLift is a managed AWS service for hosting dedicated game servers, making it easier for game developers to scale their game capacity and match players into available game sessions. With Amazon GameLift, you can host servers, track game availability, defend game servers from distributed denial of service (DDoS) attacks, and deploy updates without taking your game offline. The Amazon GameLift service powers dedicated game servers for Amazon Game Studios, as well as external game development customers, and is designed to support session-based games with game loops that start and end within a specified time.

The latest Amazon GameLift release enhances the current functionality of the service, as well as adding awesome new features to help simplify game development and deployment for developers. Let us review some of the cool features of the Amazon GameLift service:

  • Multi-engine support: Initially, the Amazon GameLift service could only be used with the Amazon Lumberyard game engine. The service is now enhanced to integrate with popular game engines like Unreal Engine and Unity, as well as custom C# and C++ game engines.
  • New server SDK language support: In order to support a larger set of customers and developers, the service provides an Amazon GameLift Server SDK available for C# and C++. This includes an Unreal Engine plugin, which is a customized version of the C++ Server SDK that is compatible with the Unreal Engine API for Amazon GameLift.
  • Client SDK language support expansion: The Amazon GameLift Client SDK is bundled with the AWS SDK, which is available in a myriad of different languages. This allows game developers to build game clients with an integration of the Amazon GameLift service in their language of choice.
  • Matchmaking: Amazon GameLift continually scans available game servers around the world and matches them against player requests to join games. If low-latency game servers are not available, you can configure the service to automatically add more capacity near your players. Amazon GameLift maintains a queue of waiting players until new games start or new instances launch, then places waiting players into the lowest-latency game, as sketched in the example following this list.
  • Player data handling: Game developers can now store custom player information and pass it directly to a game server. A game server or other game entity can then retrieve the player data from Amazon GameLift with an API call.
  • Console Support: Amazon GameLift supports games developed and architected for Xbox One and PS4.
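
To give you a feel for how the queue-based matchmaking described above looks in code, here is a rough sketch using the AWS SDK for Python (boto3). The queue name, player ID, and latency figures are hypothetical, and the sketch assumes a game session queue has already been created:

import boto3
import uuid

gamelift = boto3.client('gamelift')

# Ask GameLift to place a new game session on the lowest-latency fleet in the queue
placement = gamelift.start_game_session_placement(
    PlacementId=str(uuid.uuid4()),                 # unique ID for tracking this placement
    GameSessionQueueName='sample-shooter-queue',   # hypothetical queue name
    MaximumPlayerSessionCount=8,
    DesiredPlayerSessions=[{'PlayerId': 'player-1234'}],
    PlayerLatencies=[
        {'PlayerId': 'player-1234', 'RegionIdentifier': 'us-east-1', 'LatencyInMilliseconds': 40.0}
    ],
)
print(placement['GameSessionPlacement']['Status'])  # PENDING at first, FULFILLED once placed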

Amazon GameLift does the heavy lifting of tasks once required to create session-based multiplayer games by simplifying the process of deploying, scaling, and maintaining game servers while reducing the time, cost, and risks associated with building the infrastructure from scratch.

The reference architecture of a gaming solution that utilizes Amazon GameLift looks as follows:

 

Integrating Amazon GameLift into Your Games

The process of integrating Amazon GameLift into your game build can be broken down into a few simple steps:

  1. Prepare your game server for hosting on Amazon GameLift by setting up your game server project with the Amazon GameLift Server SDK and adding communication code to the project.
  2. Package and upload your game server build to the AWS region targeted for game deployment.
  3. Create and build a fleet of computing resources to host the game (see the sketch following these steps).
  4. Prepare your game client to connect to game sessions maintained by Amazon GameLift using the AWS SDK with the Amazon GameLift APIs, and add code to the game client to call the Amazon GameLift service and identify the player region.
  5. Test your Amazon GameLift integration by connecting to an Amazon GameLift-hosted game session and verifying that game sessions are being created.
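
To make steps 2 and 3 a little more concrete, here is a rough sketch using the AWS SDK for Python (boto3) that creates a fleet from a previously uploaded server build (for example, one uploaded with the aws gamelift upload-build CLI command). The build ID, launch path, instance type, and port are placeholder assumptions:

import boto3

gamelift = boto3.client('gamelift')

# Step 3: create a fleet of EC2 resources that runs the uploaded game server build
fleet = gamelift.create_fleet(
    Name='shooter-game-fleet',
    BuildId='build-12345678-1234-1234-1234-123456789012',  # placeholder: returned when the build was uploaded
    EC2InstanceType='c4.large',                            # assumption: size to your game's needs
    RuntimeConfiguration={
        'ServerProcesses': [
            {
                'LaunchPath': '/local/game/ShooterGameServer',  # placeholder path to the server executable
                'ConcurrentExecutions': 1,
            }
        ]
    },
    EC2InboundPermissions=[
        {'FromPort': 7777, 'ToPort': 7777, 'IpRange': '0.0.0.0/0', 'Protocol': 'UDP'}  # placeholder game port
    ],
)
print(fleet['FleetAttributes']['FleetId'])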

Let’s get started putting these steps into practice by setting up the Amazon GameLift Server SDK in a simple game server project using the Unreal game engine.

Unreal Engine (UE)

We start with Epic’s Unreal game engine. For simplicity, we will create the sample Shooter Game project with online multiplayer functionality built-in, and save it locally on the computer.

Now that I have the Multiplayer Shooter Game sample downloaded and open locally on my machine, I need to manipulate the C++ code to add the Amazon GameLift service to the UE Online Subsystem so it can manage the online game sessions. The Shooter Game sample leverages the Blueprints Visual Scripting system in Unreal Engine. The Blueprints system is a gameplay scripting system based on node-based interfaces in the UE editor, which enables game designers and content creators to create gameplay elements and functionality within the UE editor.

Since my goal is to use the Amazon GameLift C++ SDK to include the Amazon GameLift service in the game and alter the game code, I need to create a Visual Studio project solution that ties in the game and correlates the source code and any binaries from the Shooter Game to the project. To accomplish this, I navigate to the File menu and, in the menu dropdown, find and select the Generate Visual Studio Project Files option.

Once the project files have been generated, I only need to return to the File menu and select Open with Visual Studio in order to open the project and view the source code.

In preparation for adding the Amazon GameLift service to the Shooter Game as the game service and for game session management, you will need to enable the OnlineSubsystem module in your project. To do this, open the game build settings file in the Visual Studio project. Since this game project is named ShooterGame, the build file is named ShooterGame.Build.cs and is located in the Source/ShooterGame folder, as shown below.

Open your build file and uncomment the line for the OnlineSubsystemNull module. Since I am using a sample that already utilizes a multiplayer online system, my build options are set appropriately, and the code looks like this:

public class ShooterGame : ModuleRules
{
	public ShooterGame(TargetInfo Target)
	{
		PrivateIncludePaths.AddRange(
			new string[] { 
				"ShooterGame/Classes/Player",
				"ShooterGame/Private",
				"ShooterGame/Private/UI",
				"ShooterGame/Private/UI/Menu",
				"ShooterGame/Private/UI/Style",
				"ShooterGame/Private/UI/Widgets",
			}
		);
		PublicDependencyModuleNames.AddRange(
			new string[] {
				"Core",
				"CoreUObject",
				"Engine",
				"OnlineSubsystem",
				"OnlineSubsystemUtils",
				"AssetRegistry",
             			"AIModule",
				"GameplayTasks",
			}
		);
		PrivateDependencyModuleNames.AddRange(
			new string[] {
				"InputCore",
				"Slate",
				"SlateCore",
				"ShooterGameLoadingScreen",
				"Json"
			}
		);
		DynamicallyLoadedModuleNames.AddRange(
			new string[] {
				"OnlineSubsystemNull",
				"NetworkReplayStreaming",
				"NullNetworkReplayStreaming",
				"HttpNetworkReplayStreaming"
			}
		);
		PrivateIncludePathModuleNames.AddRange(
			new string[] {
				"NetworkReplayStreaming"
			}
		);
	}
}

Now that we are set with the Shooter Game project, let’s turn our attention to the Amazon GameLift SDK. I want to leverage the C++ SDK as a plugin for the Unreal Engine; therefore, I need to compile the SDK using a compilation directive that builds the binaries for this game engine.

With the SDK source downloaded, I can compile the SDK from the source based upon my operating system. Since I am using a Windows machine for this project, I will complete the following steps:

  • Make an out directory to hold the binaries generated from the code compilation:

mkdir out

  • Change to the previously created directory:

cd out

  • Use CMake to specify a build system generator for VS 2015 Win x64 and set UE compilation flag:

cmake -DBUILD_FOR_UNREAL=1 -G "Visual Studio 14 2015 Win64" <source directory>

  • Build C++ project to create binaries using selected Build System (MS Build for this project):

msbuild ALL_BUILD.vcxproj /p:Configuration=Release

With my libraries compiled, I should have the following binary files required to use the Amazon GameLift Unreal Engine plugin.

Linux:

* out/prefix/lib/aws-cpp-sdk-gamelift-server.so

Windows:

* out\prefix\bin\aws-cpp-sdk-gamelift-server.dll

* out\prefix\lib\aws-cpp-sdk-gamelift-server.lib

As you can see below, since I am on Windows, my compiled Amazon GameLift libraries, aws-cpp-sdk-gamelift-server.dll and aws-cpp-sdk-gamelift-server.lib, are located in the prefix\bin and prefix\lib folders respectively.

After copying the binaries to the GameLiftSDK Unreal Engine plugin folder, my Amazon GameLift plugin folder is configured and ready to be added to an Unreal Engine game project.

Given this, it is now time to add the Amazon GameLift plugin to the Unreal Engine ShooterGame project. I could use the Unreal Engine Editor to add the plugin, but instead, I will stay in the Visual Studio project and add the plugin by updating the game directory and project file.

In Windows Explorer, I add a folder called Plugins in the ShooterGame directory and copy my prepared GameLiftServerSDK folder into the directory as noted by the Unreal Engine documentation on plugins.

Now I will open up the ShooterGame.Build.cs file, which is a C# file that holds information about game dependencies.

Within the file I will add the following code:

PublicDependencyModuleNames.AddRange(
            new string[] {
                "Core",
                "CoreUObject",
                "Engine",
                "InputCore",
                "GameLiftServerSDK"
            }
       );

Just to ensure all is in sync with the changes made thus far, I close Visual Studio, go back to the UE Editor, and select Refresh Visual Studio Project.

Upon completion, I select Open Visual Studio and the Plugins folder I added in the ShooterGame directory is now included in the project and able to be viewed in Solution Explorer.

Next, I rebuild my entire solution to get the Amazon GameLift SDK binaries integrated into the project.

I’ll go back to the UE Editor and select Build from the toolbar to ensure the aspects of the Amazon GameLift plugin are included in my ShooterGame. Once compilation is complete, a quick visit to the Settings toolbar and Plugins option shows that the Amazon GameLift plugin is added and is recognized in the project. I will select the Enabled checkbox, which will prompt me to restart the UE Editor. I select Restart Now and allow the Unreal Engine to rebuild the game code files.

Upon completion of the build, the editor will restart and reopen my ShooterGame.

Now things are set for the use of the Amazon GameLift SDK in the ShooterGame project.

With the Unreal editor open, I’ll go into the Open Visual Studio menu option to get back to the ShooterGame code. This will open up Visual Studio and the game code. With Visual Studio open, I go to the ShooterGameMode.cpp file to add the code to initialize the Amazon GameLift SDK. Some key things I must do in order to correctly add the code for Amazon GameLift within my Shooter game project are:

  1. Enclose the Amazon GameLift code within a preprocessor condition using the flag WITH_GAMELIFT=1
  2. Build a dedicated server in Unreal Engine for my targeted server OS, e.g., Linux
  3. Ensure my build target is a game server type i.e. Type == TargetRules.TargetType.Server

You can find an example of the code needed to add Amazon GameLift in your Unreal Engine project in the documentation here. In addition, you can learn how to build a dedicated server for Unreal Engine by following the Dedicated Server Guide for Windows and Linux provided in the Unreal Engine wiki. With these resources in hand, you should be well on your way to integrating Amazon GameLift into a game project.

I just did a quick review of incorporating the Amazon GameLift SDK into the Unreal Engine, but don’t forget that you also have the option to add the Amazon GameLift SDK to C# engines like Unity. Download the Amazon GameLift Server SDK and compile the .NET Framework 3.5 solution, GameLiftServerSDKNet35.sln; this solution will enable you to add the Amazon GameLift libraries to your Unity3D project. Review the Amazon GameLift SDK documentation, Using the C# Server SDK for Unity, to learn more about setting up and using the Amazon GameLift C# Server SDK plugin.

Summary

We reviewed just one of the new aspects added to the Amazon GameLift managed service, but the service provides game developers and game studios with even more. Amazon GameLift enables the building of distributed games by making it easy to manage infrastructure, scale capacity, and match players into available game sessions while defending games from DDoS attacks.

You can learn more about the Amazon GameLift service by reviewing the Amazon GameLift documentation and the Amazon GameLift developer guide, or by checking out the Amazon GameLift tutorials on the Amazon GameDev tutorial page to hit the ground running with Amazon GameLift game development.

Happy Gaming!

– Tara

Amazon Elasticsearch Service support for Elasticsearch 5.1

by Tara Walker | on | in Amazon Elasticsearch Service, Launch | | Comments

The Amazon Elasticsearch Service is a fully managed service that provides easier deployment, operation, and scale for the Elasticsearch open-source search and analytics engine. We are excited to announce that Amazon Elasticsearch Service now supports Elasticsearch 5.1 and Kibana 5.1.

Elasticsearch 5 comes with a ton of new features and enhancements that customers can now take advantage of in Amazon Elasticsearch Service. Highlights of the Elasticsearch 5 release are as follows:

  • Indexing performance: Improved Indexing throughput with updates to lock implementation & async translog fsyncing
  • Ingestion Pipelines: Incoming data can be sent to a pipeline that applies a series of ingestion processors, allowing transformation to the exact data you want to have in your search index. There are twenty processors included, from simple appending to complex regex applications
  • Painless scripting: Amazon Elasticsearch Service supports Painless, a new secure and performant scripting language for Elasticsearch 5. You can use scripting to change the precedence of search results, delete index fields by query, modify search results to return specific fields, and more.
  • New data structures: Lucene 6 data structures, new data types; half_float, text, keyword, and more complete support for dots-in-fieldnames
  • Search and Aggregations: Refactored search API, BM25 relevance calculations, Instant Aggregations, improvements to histogram aggregations & terms aggregations, and rewritten percolator & completion suggester
  • User experience: Strict settings and body & query string parameter validation, index management improvement, default deprecation logging, new shard allocation API, and new indices efficiency pattern for rollover & shrink APIs
  • Java REST client: a simple HTTP/REST Java client that works with Java 7 and handles retry on node failure, as well as round-robin, sniffing, and logging of requests
  • Other improvements: Lazy unicast hosts DNS lookup, automatic parallel tasking of reindex, update-by-query, delete-by-query, and search cancellation by task management API

The compelling new enhancements of Elasticsearch 5 are meant to make the service faster and easier to use while providing better security. Amazon Elasticsearch Service is a managed service designed to aid customers in building, developing and deploying solutions with Elasticsearch by providing the following capabilities:

  • Multiple configurations of instance types
  • Amazon EBS volumes for data storage
  • Cluster stability improvement with dedicated master nodes
  • Zone awareness – Cluster node allocation across two Availability Zones in the region
  • Access Control & Security with AWS Identity and Access Management (IAM)
  • Various geographical locations/regions for resources
  • Amazon Elasticsearch domain snapshots for replication, backup and restore
  • Integration with Amazon CloudWatch for monitoring Amazon Elasticsearch domain metrics
  • Integration with AWS CloudTrail for configuration auditing
  • Integration with other AWS Services like Kinesis Firehose and DynamoDB for loading of real-time streaming data into Amazon Elasticsearch Service

Amazon Elasticsearch Service allows dynamic changes with zero downtime. You can add instances, remove instances, change instance sizes, change storage configuration, and make other changes dynamically.

The best way to highlight some of the aforementioned capabilities is with an example.

During a presentation at the IT/Dev conference, I demonstrated how to build a serverless employee onboarding system using Express.js, AWS Lambda, Amazon DynamoDB, and Amazon S3. In the demo, the information collected was personnel data stored in DynamoDB about an employee going through a fictional onboarding process. Imagine if the collected employee data could be searched, queried, and analyzed as needed by the company’s HR department. We can easily augment the onboarding system to add these capabilities by enabling the employee table to use DynamoDB Streams to trigger Lambda and store the desired employee attributes in Amazon Elasticsearch Service.

The result is the following solution architecture:

We will focus solely on how to dynamically store and index employee data in Amazon Elasticsearch Service each time an employee record is entered and subsequently stored in the database.
To add this enhancement to the existing onboarding solution, we will implement it as shown in the detailed cloud architecture diagram below:

Let’s look at how to implement the employee load process to the Amazon Elasticsearch Service, which is the first process flow shown in the diagram above.

Amazon Elasticsearch Service: Domain Creation

Let’s now visit the AWS Console to check out Amazon Elasticsearch Service with Elasticsearch 5 in action. As you probably guessed, from the AWS Console home, we select Elasticsearch Service under the Analytics group.

The first step in creating an Elasticsearch solution is to create a domain. You will notice that when creating an Amazon Elasticsearch Service domain, you now have the option to choose the Elasticsearch 5.1 version. Since we are discussing the launch of support for Elasticsearch 5, we will, of course, choose the 5.1 Elasticsearch engine version when creating our domain in the Amazon Elasticsearch Service.


After clicking Next, we set up our Elasticsearch domain by configuring the instance and storage settings. The instance type and the number of instances for your cluster should be determined based upon your application’s availability, network volume, and data needs. A recommended best practice is to choose two or more instances in order to avoid possible data inconsistencies or split-brain failure conditions with Elasticsearch. Therefore, I will choose two instances/data nodes for my cluster and set up EBS as my storage device.

To understand how many instances you will need for your specific application, please review the blog post, Get Started with Amazon Elasticsearch Service: How Many Data Instances Do I Need, on the AWS Database blog.

All that is left for me is to set up the access policy and deploy the service. Once I create my service, the domain will be initialized and deployed.
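
If you prefer to script the domain creation rather than click through the console, a minimal sketch with the AWS SDK for Python (boto3) might look like the following; the domain name, instance type, and volume size are illustrative assumptions:

import boto3

es = boto3.client('es')

# Create an Elasticsearch 5.1 domain with two data nodes and EBS storage
domain = es.create_elasticsearch_domain(
    DomainName='onboarding-search',   # hypothetical domain name
    ElasticsearchVersion='5.1',
    ElasticsearchClusterConfig={
        'InstanceType': 'm4.large.elasticsearch',  # assumption: size to your workload
        'InstanceCount': 2,                        # two data nodes, per the best practice above
        'DedicatedMasterEnabled': False,
        'ZoneAwarenessEnabled': False,
    },
    EBSOptions={
        'EBSEnabled': True,
        'VolumeType': 'gp2',
        'VolumeSize': 10,   # GiB per data node
    },
)
print(domain['DomainStatus']['ARN'])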

Now that I have my Elasticsearch service running, I need a mechanism to populate it with data. I will implement a dynamic data load process of the employee data to Amazon Elasticsearch Service using DynamoDB Streams.

Amazon DynamoDB: Table and Streams

Before I head to the DynamoDB console, I will quickly cover the basics.

Amazon DynamoDB is a scalable, distributed NoSQL database service. DynamoDB Streams provide an ordered, time-based sequence of every create, update, and delete operation on the items in a DynamoDB table. Each stream record has information about the modification made to an individual item in the table. Streams execute asynchronously and can write stream records in near real time. Additionally, a stream can be enabled when a table is created, or can be enabled and modified on an existing table. You can learn more about DynamoDB Streams in the DynamoDB developer guide.
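
As a quick illustration of that last point, enabling a stream on an existing table is a single API call. Here is a minimal sketch with the AWS SDK for Python (boto3), using the table from this walkthrough:

import boto3

dynamodb = boto3.client('dynamodb')

# Enable a stream of NEW_IMAGE records on an existing table
response = dynamodb.update_table(
    TableName='OnboardingEmployeeData',
    StreamSpecification={
        'StreamEnabled': True,
        'StreamViewType': 'NEW_IMAGE',  # each record carries the full item as it appears after the change
    },
)
# The stream has its own ARN, separate from the table ARN; we will need it for the IAM policy and Lambda trigger
print(response['TableDescription']['LatestStreamArn'])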

Now we will head to the DynamoDB console and view the OnboardingEmployeeData table.

This table has a primary partition key, UserID, which is a string, and a primary sort key, Username, also a string. We will use the UserID as the document ID in Elasticsearch. You will also notice that streams are enabled on this table and the stream view type is New image. A stream with the New image view type produces stream records that contain the entire item as it appears after it has been updated. You also have the option to have the stream present records with the data items before modification, with only the items’ key attributes, or with both old and new item information. If you opt to use the AWS CLI to create your DynamoDB table, the key information to capture is the Latest Stream ARN shown underneath the Stream Details section. A DynamoDB stream has a unique ARN identifier that is separate from the ARN of the DynamoDB table. The stream ARN will be needed to create the IAM policy for access permissions between the stream and the Lambda function.

IAM Policy

The first thing that is essential for any service implementation is getting the correct permissions in place. Therefore, I will first go to the IAM console to create a role and a policy for my Lambda function that will provide permissions for DynamoDB and Elasticsearch.

First, I will create a policy based upon an existing managed policy for Lambda execution with DynamoDB Streams.

This will take us to the Review Policy screen, which shows the selected managed policy details. I’ll name this policy Onboarding-LambdaDynamoDB-toElasticsearch and then customize it for my solution. The first thing you should notice is that the current policy allows access to all streams; the best practice, however, is to have this policy access only the specific DynamoDB stream, by adding its Latest Stream ARN. Hence, I will alter the policy, add the stream ARN for the DynamoDB table, OnboardingEmployeeData, and validate the policy. The altered policy is shown below.

The only thing left is to add the Amazon Elasticsearch Service permissions in the policy. The core policy for Amazon Elasticsearch Service access permissions is as shown below:

 

I will use this policy and add the specific Elasticsearch domain ARN as the Resource for the policy. This ensures that I have a policy that enforces the Least Privilege security best practice for policies. With the Amazon Elasticsearch Service domain added as shown, I can validate and save the policy.

The best way to create a custom policy is to use the IAM Policy Simulator or to review the examples of AWS service permissions in the service documentation. You can also find examples of policies for a subset of AWS services here. Remember, you should only add the ES permissions that are needed, following the Least Privilege security best practice; the policy shown above is used only as an example.
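
For completeness, here is a rough sketch of creating such a policy with the AWS SDK for Python (boto3). The stream and domain ARNs are placeholders you would replace with your own values, and the statements follow the same Least Privilege idea described above:

import boto3
import json

iam = boto3.client('iam')

# Placeholder ARNs; substitute your table's Latest Stream ARN and your Elasticsearch domain ARN
stream_arn = 'arn:aws:dynamodb:us-east-1:123456789012:table/OnboardingEmployeeData/stream/2017-03-01T00:00:00.000'
domain_arn = 'arn:aws:es:us-east-1:123456789012:domain/onboarding-search/*'

policy_document = {
    'Version': '2012-10-17',
    'Statement': [
        {   # Read from the specific DynamoDB stream only
            'Effect': 'Allow',
            'Action': [
                'dynamodb:DescribeStream',
                'dynamodb:GetRecords',
                'dynamodb:GetShardIterator',
                'dynamodb:ListStreams',
            ],
            'Resource': stream_arn,
        },
        {   # Write function logs to CloudWatch Logs
            'Effect': 'Allow',
            'Action': ['logs:CreateLogGroup', 'logs:CreateLogStream', 'logs:PutLogEvents'],
            'Resource': 'arn:aws:logs:*:*:*',
        },
        {   # POST documents to the specific Elasticsearch domain only
            'Effect': 'Allow',
            'Action': ['es:ESHttpPost'],
            'Resource': domain_arn,
        },
    ],
}

iam.create_policy(
    PolicyName='Onboarding-LambdaDynamoDB-toElasticsearch',
    PolicyDocument=json.dumps(policy_document),
)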

Next, we create the role that our Lambda function will use to gain access, and attach the aforementioned policy to the role.

AWS Lambda: DynamoDB triggered Lambda function

AWS Lambda is the core of the Amazon Web Services serverless computing offering. With Lambda, you can write and run code using supported languages for almost any type of application or backend service. Lambda triggers your code in response to events from AWS services or from HTTP requests, scales dynamically based upon workload, and charges you only for the time your code executes.

We will have DynamoDB Streams trigger a Lambda function that creates an index and sends data to Elasticsearch. Another option is to use the Logstash plugin for DynamoDB. However, since several of the Logstash processors are now included in the Elasticsearch 5.1 core, along with improved performance optimizations, I will opt to use Lambda to process my DynamoDB stream and load data to Amazon Elasticsearch Service.
Now let us head over to the AWS Lambda console and create the Lambda function for loading employee data to Amazon Elasticsearch Service.

Once in the console, I create a new Lambda function by selecting the Blank Function blueprint, which takes me to the Configure Triggers page. On the trigger page, I select DynamoDB as the AWS service that will trigger Lambda, and I provide the following trigger-related options:

  • Table: OnboardingEmployeeData
  • Batch size: 100 (default)
  • Starting position: Trim Horizon

I hit the Next button, and I am on the Configure Function screen. The name of my function will be ESEmployeeLoad, and I will write this function in Node.js 4.3.

The Lambda function code is as follows:

var AWS = require('aws-sdk');
var path = require('path');

//Object for all the ElasticSearch Domain Info
var esDomain = {
    region: process.env.RegionForES,
    endpoint: process.env.EndpointForES,
    index: process.env.IndexForES,
    doctype: 'onboardingrecords'
};
//AWS Endpoint from created ES Domain Endpoint
var endpoint = new AWS.Endpoint(esDomain.endpoint);
//The AWS credentials are picked up from the environment.
var creds = new AWS.EnvironmentCredentials('AWS');

console.log('Loading function');
exports.handler = (event, context, callback) => {
    //console.log('Received event:', JSON.stringify(event, null, 2));
    console.log(JSON.stringify(esDomain));
    
    event.Records.forEach((record) => {
        console.log(record.eventID);
        console.log(record.eventName);
        console.log('DynamoDB Record: %j', record.dynamodb);
       
        var dbRecord = JSON.stringify(record.dynamodb);
        postToES(dbRecord, context, callback);
    });
};

function postToES(doc, context, lambdaCallback) {
    var req = new AWS.HttpRequest(endpoint);

    req.method = 'POST';
    req.path = path.join('/', esDomain.index, esDomain.doctype);
    req.region = esDomain.region;
    req.headers['presigned-expires'] = false;
    req.headers['Host'] = endpoint.host;
    req.body = doc;

    var signer = new AWS.Signers.V4(req , 'es');  // es: service code
    signer.addAuthorization(creds, new Date());

    var send = new AWS.NodeHttpClient();
    send.handleRequest(req, null, function(httpResp) {
        var respBody = '';
        httpResp.on('data', function (chunk) {
            respBody += chunk;
        });
        httpResp.on('end', function (chunk) {
            console.log('Response: ' + respBody);
            lambdaCallback(null,'Lambda added document ' + doc);
        });
    }, function(err) {
        console.log('Error: ' + err);
        lambdaCallback('Lambda failed with error ' + err);
    });
}

The Lambda function Environment variables are:

I will select an Existing role option and choose the ESOnboardingSystem IAM role I created earlier.

Upon completing my IAM role permissions for the Lambda function, I can review the Lambda function details and complete the creation of ESEmployeeLoad function.
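
The console wired up the DynamoDB trigger for me, but the same mapping can also be created with a single API call. A minimal sketch with the AWS SDK for Python (boto3), using a placeholder stream ARN, would be:

import boto3

lambda_client = boto3.client('lambda')

# Connect the DynamoDB stream to the ESEmployeeLoad function
mapping = lambda_client.create_event_source_mapping(
    EventSourceArn='arn:aws:dynamodb:us-east-1:123456789012:table/OnboardingEmployeeData/stream/2017-03-01T00:00:00.000',  # placeholder
    FunctionName='ESEmployeeLoad',
    BatchSize=100,
    StartingPosition='TRIM_HORIZON',
)
print(mapping['UUID'], mapping['State'])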

I have completed the process of building my Lambda function to talk to Elasticsearch, and now I test my function by simulating data changes to my database.

Now my function, ESEmployeeLoad, will execute upon changes to the data in my database from my onboarding system. Additionally, I can review the processing of the Lambda function to Elasticsearch by reviewing the CloudWatch logs.

Now I can alter my Lambda function to take advantage of the new features, or go directly to Elasticsearch and utilize the new ingest node capability. An example of this would be to implement an ingest pipeline for my employee record documents, as sketched below.
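
To sketch what such a pipeline could look like, the snippet below registers a hypothetical employee-records pipeline against the domain's ingest API using the Python requests library. The endpoint, field names, and processors are assumptions for illustration, and the call assumes the domain's access policy permits the caller (for example, an IP-based policy); otherwise the request would need to be SigV4-signed, as the Lambda code above does:

import requests

# Placeholder Amazon Elasticsearch Service domain endpoint
endpoint = 'https://search-onboarding-search-abc123.us-east-1.es.amazonaws.com'

pipeline = {
    'description': 'Normalize employee records before indexing',
    'processors': [
        {'lowercase': {'field': 'Email'}},                         # example: normalize email addresses
        {'set': {'field': 'record_source', 'value': 'dynamodb'}},  # example: tag where the document came from
    ],
}

# Register the pipeline; documents indexed with ?pipeline=employee-records will pass through it
resp = requests.put(endpoint + '/_ingest/pipeline/employee-records', json=pipeline)
print(resp.status_code, resp.text)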

I can replicate this function to handle badge updates to the employee record, and/or leverage other preprocessors against the employee data. For instance, if I wanted to search the data based upon a particular field in the Elasticsearch document, I could use the Search API to retrieve records from the dataset.

The possibilities are endless, and you can get as creative as your data needs dictate while maintaining great performance.

Amazon Elasticsearch Service: Kibana 5.1

All Amazon Elasticsearch Service domains using Elasticsearch 5.1 are bundled with Kibana 5.1, the latest version of the open-source visualization tool.

The companion visualization and analytics platform, Kibana, has also been enhanced in the 5.1 release. Kibana is used to view, search, and interact with Elasticsearch data through a myriad of different charts, tables, and maps. In addition, Kibana performs advanced data analysis on large volumes of data. Key enhancements of the Kibana release are as follows:

  • Visualization tool new design: Updated color scheme and maximization of screen real-estate
  • Timelion: visualization tool with a time-based query DSL
  • Console: formerly known as Sense is now part of the core, using the same configuration for free-form requests to Elasticsearch
  • Scripted field language: ability to use the new Painless scripting language in the Elasticsearch cluster
  • Tag Cloud Visualization: 5.1 adds a word-based graphical view of data sized by importance
  • More Charts: return of previously removed charts and addition of advanced view for X-Pack
  • Profiler UI: provides an enhancement to the profile API with a tree view
  • Rendering performance improvement: Discover performance fixes, decrease of CPU load

Summary

As you can see, this release is expansive, with many enhancements to assist customers in building Elasticsearch solutions. Amazon Elasticsearch Service now supports 15 new Elasticsearch APIs and 6 new plugins for Elasticsearch 5.1.

You can read more about the supported operations for Elasticsearch in the Amazon Elasticsearch Developer Guide, and you can get started by visiting the Amazon Elasticsearch Service website or by signing into the AWS Management Console.

– Tara

 

AWS Organizations – Policy-Based Management for Multiple AWS Accounts

by Jeff Barr | on | in AWS Organizations, Launch | | Comments

Over the years I have found that many of our customers are managing multiple AWS accounts. This situation can arise for several reasons. Sometimes they adopt AWS incrementally and organically, with individual teams and divisions making the move to cloud computing on a decentralized basis. Other companies grow through mergers and acquisitions and take on responsibility for existing accounts. Still others routinely create multiple accounts in order to meet strict guidelines for compliance or to create a very strong isolation barrier between applications, sometimes going so far as to use distinct accounts for development, testing, and production.

As these accounts proliferate, our customers find that they would like to manage them in a scalable fashion. Instead of dealing with a multitude of per-team, per-division, or per-application accounts, they have asked for a way to define access control policies that can be easily applied to all, some, or individual accounts. In many cases, these customers are also interested in additional billing and cost management, and would like to be able to control how AWS pricing benefits such as volume discounts and Reserved Instances are applied to their accounts.

AWS Organizations Emerges from Preview
To support this increasingly important use case, we are moving AWS Organizations from Preview to General Availability today. You can use Organizations to centrally manage multiple AWS accounts, with the ability to create a hierarchy of Organizational Units (OUs), assign each account to an OU, define policies, and then apply them to the entire hierarchy, to select OUs, or to specific accounts. You can invite existing AWS accounts to join your organization and you can also create new accounts. All of these functions are available from the AWS Management Console, the AWS Command Line Interface (CLI), and through the AWS Organizations API.

Here are some important terms and concepts that will help you to understand Organizations (this assumes that you are the all-powerful, overall administrator of your organization’s AWS accounts, and that you are responsible for the Master account):

An Organization is a consolidated set of AWS accounts that you manage. Newly-created Organizations offer the ability to implement sophisticated, account-level controls such as Service Control Policies. This allows Organization administrators to manage lists of allowed and blocked AWS API functions and resources that place guard rails on individual accounts. For example, you could give your advanced R&D team access to a wide range of AWS services, and then be a bit more cautious with your mainstream development and test accounts. Or, on the production side, you could allow access only to AWS services that are eligible for HIPAA compliance.

Some of our existing customers use a feature of AWS called Consolidated Billing. This allows them to select a Payer Account which rolls up account activity from multiple AWS Accounts into a single invoice and provides a centralized way of tracking costs. With this launch, current Consolidated Billing customers now have an Organization that provides all the capabilities of Consolidated Billing, but by default does not have the new features (like Service Control Policies) we’re making available today. These customers can easily enable the full features of AWS Organizations. This is accomplished by first enabling the use of all AWS Organization features from the Organization’s master account and then having each member account authorize this change to the Organization. Finally, we will continue to support creating new Organizations that support only the Consolidated Billing capabilities. Customers that wish to only use the centralized billing features can continue to do so, without allowing the master account administrators to enforce the advanced policy controls on member accounts in the Organization.

An AWS account is a container for AWS resources.

The Master account is the management hub for the Organization and is also the payer account for all of the AWS accounts in the Organization. The Master account can invite existing accounts to join the Organization, and can also create new accounts.

Member accounts are the non-Master accounts in the Organization.

An Organizational Unit (OU) is a container for a set of AWS accounts. OUs can be arranged into a hierarchy that can be up to five levels deep. The top of the hierarchy of OUs is also known as the Administrative Root.

A Service Control Policy (SCP) is a set of controls that the Organization’s Master account can apply to the Organization, selected OUs, or to selected accounts. When applied to an OU, the SCP applies to the OU and to any other OUs beneath it in the hierarchy. The SCP or SCPs in effect for a member account specify the permissions that are granted to the root user for the account. Within the account, IAM users and roles can be used as usual. However, regardless of how permissive the user or the role might be, the effective set of permissions will never extend beyond what is defined in the SCP. You can use this to exercise fine-grained control over access to AWS services and API functions at the account level.

An Invitation is used to ask an AWS account to join an Organization. It must be accepted within 15 days, and can be extended via email address or account ID. Up to 20 Invitations can be outstanding at any given time. The invitation-based model allows you to start from a Master account and then bring existing accounts into the fold. When an Invitation is accepted, the account joins the Organization and all applicable policies become effective. Once the account has joined the Organization, you can move it to the proper OU.

AWS Organizations is appropriate when you want to create strong isolation boundaries between the AWS accounts that you manage. However, keep in mind that AWS resources (EC2 instances, S3 buckets, and so forth) exist within a particular AWS account and cannot be moved from one account to another. You do have access to many different cross-account AWS features including VPC peering, AMI sharing, EBS snapshot sharing, RDS snapshot sharing, cross-account email sending, delegated access via IAM roles, cross-account S3 bucket permissions, and cross-account access in the AWS Management Console.

Like consolidated billing, AWS Organizations also provides several benefits when it comes to the use of EC2 and RDS Reserved Instances. For billing purposes, all of the accounts in the Organization are treated as if they are one account and can receive the hourly cost benefit of an RI purchased by any other account in the same Organization (in order for this benefit to be applied as expected, the Availability Zone and other attributes of the RI must match the attributes of the EC2 or RDS instance).

Creating an Organization
Let’s create an Organization from the Console, create some Organizational Units, and then create some accounts. I start by clicking on Create organization:

Then I choose ENABLE ALL FEATURES and click on Create organization:

My Organization is ready in seconds:

I can create a new account by clicking on Add account, and then selecting Create account:

Then I supply the details (the IAM role is created in the new account and grants enough permissions for the account to be customized after creation):

Here’s what the console looks like after I have created Dev, Test, and Prod accounts:

At this point all of the accounts are at the top of the hierarchy:

In order to add some structure, I click on Organize accounts, select Create organizational unit (OU), and enter a name:

I do the same for a second OU:

Then I select the Prod account, click on Move accounts, and choose the Operations OU:

Next, I move the Dev and Test accounts into the Development OU:

At this point I have four accounts (my original one plus the three that I just created) and two OUs. The next step is to create one or more Service Control Policies by clicking on Policies and selecting Create policy. I can use the Policy Generator or I can copy an existing SCP and then customize it. I’ll use the Policy Generator. I give my policy a name and make it an Allow policy:

Then I use the Policy Generator to construct a policy that allows full access to EC2 and S3, and the ability to run (invoke) Lambda functions:

Remember that this policy defines the full set of allowable actions within the account. In order to allow IAM users within the account to use these actions, I would still need to create suitable IAM policies and attach them to the users (all within the member account). I click on Create policy and my policy is ready:

Then I create a second policy for development and testing. This one also allows access to AWS CodeCommit, AWS CodeBuild, AWS CodeDeploy, and AWS CodePipeline:

Let’s recap. I have created my accounts and placed them into OUs. I have created a policy for the OUs. Now I need to enable the use of policies, and attach the policy to the OUs. To enable the use of policies, I click on Organize accounts and select Home (this is not the same as the root because Organizations was designed to support multiple, independent hierarchies), and then click on the checkbox in the Root OU. Then I look to the right, expand the Details section, and click on Enable:

Ok, now I can put all of the pieces together! I click on the Root OU to descend in to the hierarchy, and then click on the checkbox in the Operations OU. Then I expand the Control Policies on the right and click on Attach policy:

Then I locate the OperationsPolicy and click on Attach:

Finally, I remove the FullAWSAccess policy:

I can also attach the DevTestPolicy to the Development OU.

All of the operations that I described above could have been initiated from the AWS Command Line Interface (CLI) or by making calls to functions such as CreateOrganization, CreateAccount, CreateOrganizationalUnit, MoveAccount, CreatePolicy, AttachPolicy, and InviteAccountToOrganization. To see the CLI in action, read Announcing AWS Organizations: Centrally Manage Multiple AWS Accounts.
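
Here is a rough, condensed sketch of that API-driven path using the AWS SDK for Python (boto3). The email address, OU name, and policy content are illustrative assumptions rather than the exact values from the console walkthrough:

import boto3
import json
import time

org = boto3.client('organizations')

# Create the Organization with all features enabled, then find the root of the hierarchy
org.create_organization(FeatureSet='ALL')
root_id = org.list_roots()['Roots'][0]['Id']

# Enable Service Control Policies on the root (skip this call if the policy type is already enabled)
org.enable_policy_type(RootId=root_id, PolicyType='SERVICE_CONTROL_POLICY')

# Create an OU and a new member account; account creation is asynchronous, so poll for completion
ou_id = org.create_organizational_unit(ParentId=root_id, Name='Operations')['OrganizationalUnit']['Id']
status = org.create_account(Email='prod@example.com', AccountName='Prod')['CreateAccountStatus']
while status['State'] == 'IN_PROGRESS':
    time.sleep(5)
    status = org.describe_create_account_status(CreateAccountRequestId=status['Id'])['CreateAccountStatus']

# Move the new account into the OU
org.move_account(AccountId=status['AccountId'], SourceParentId=root_id, DestinationParentId=ou_id)

# Create an SCP that allows EC2, S3, and Lambda invocation, and attach it to the OU
scp = {
    'Version': '2012-10-17',
    'Statement': [{'Effect': 'Allow', 'Action': ['ec2:*', 's3:*', 'lambda:InvokeFunction'], 'Resource': '*'}],
}
policy_id = org.create_policy(
    Name='OperationsPolicy',
    Description='Full EC2 and S3 access plus Lambda invocation',
    Type='SERVICE_CONTROL_POLICY',
    Content=json.dumps(scp),
)['Policy']['PolicySummary']['Id']
org.attach_policy(PolicyId=policy_id, TargetId=ou_id)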

Best Practices for Use of AWS Organizations
Before I wrap up, I would like to share some best practices for the use of AWS Organizations:

Master Account – We recommend that you keep the Master account free of any operational AWS resources (with one exception). In addition to making it easier for you to make high-quality control decisions, this practice will make it easier for you to understand the charges on your AWS bill.

CloudTrail – Use AWS CloudTrail (this is the exception) in the Master Account to centrally track all AWS usage in the Member accounts.

Least Privilege – When setting up policies for your OUs, assign as few privileges as possible.

Organizational Units – Assign policies to OUs rather than to accounts. This will allow you to maintain a better mapping between your organizational structure and the level of AWS access needed.

Testing – Test new and modified policies on a single account before scaling up.

Automation – Use the APIs and an AWS CloudFormation template to ensure that every newly created account is configured to your liking. The template can create IAM users, roles, and policies. It can also set up logging, create and configure VPCs, and so forth.

Learning More
The AWS Organizations User Guide and the AWS Organizations API Reference are good resources to help you get started with AWS Organizations.

Things to Know
AWS Organizations is available today in all AWS regions except China (Beijing) and AWS GovCloud (US) and is available to you at no charge (to be a bit more precise, the service endpoint is located in US East (Northern Virginia) and the SCPs apply across all relevant regions). All of the accounts must be from the same seller; you cannot mix AWS and AISPL (the local legal Indian entity that acts as a reseller for AWS services accounts in India) in the same Organization.

We have big plans for Organizations, and are currently thinking about adding support for multiple payers, control over allocation of Reserved Instance discounts, multiple hierarchies, and other control policies. As always, your feedback and suggestions are welcome.

Jeff;

Now Available – I3 Instances for Demanding, I/O Intensive Applications

by Jeff Barr | on | in Amazon EC2, Launch | | Comments

On the first day of AWS re:Invent I published an EC2 Instance Update and promised to share additional information with you as soon as I had it.

Today I am happy to be able to let you know that we are making six sizes of our new I3 instances available in fifteen AWS regions! Designed for I/O intensive workloads and equipped with super-efficient NVMe SSD storage, these instances can deliver up to 3.3 million IOPS at a 4 KB block size and up to 16 GB/second of sequential disk throughput. This makes them a great fit for any workload that requires high throughput and low latency, including relational databases, NoSQL databases, search engines, data warehouses, real-time analytics, and disk-based caches. When compared to the I2 instances, I3 instances deliver storage that is less expensive and more dense, with the ability to deliver substantially more IOPS and more network bandwidth per CPU core.

The Specs
Here are the instance sizes and the associated specs:

Instance Name   vCPU Count   Memory      Instance Storage (NVMe SSD)   Price/Hour
i3.large        2            15.25 GiB   0.475 TB                      $0.15
i3.xlarge       4            30.5 GiB    0.950 TB                      $0.31
i3.2xlarge      8            61 GiB      1.9 TB                        $0.62
i3.4xlarge      16           122 GiB     3.8 TB (2 disks)              $1.25
i3.8xlarge      32           244 GiB     7.6 TB (4 disks)              $2.50
i3.16xlarge     64           488 GiB     15.2 TB (8 disks)             $4.99

The prices shown are for On-Demand instances in the US East (Northern Virginia) Region; see the EC2 pricing page for more information.

I3 instances are available in On-Demand, Reserved, and Spot form in the US East (Northern Virginia), US West (Oregon), US West (Northern California), US East (Ohio), Canada (Central), South America (São Paulo), EU (Ireland), EU (London), EU (Frankfurt), Asia Pacific (Singapore), Asia Pacific (Tokyo), Asia Pacific (Seoul), Asia Pacific (Mumbai), Asia Pacific (Sydney), and AWS GovCloud (US) Regions. You can also use them as Dedicated Hosts and as Dedicated Instances.

These instances support Hardware Virtualization (HVM) AMIs only, and must be run within a Virtual Private Cloud. In order to benefit from the performance made possible by the NVMe storage, you must run one of the following operating systems:

  • Amazon Linux AMI
  • CentOS – 7.0 or better
  • Ubuntu – 16.04 or 16.10
  • SUSE 12
  • SUSE 11 with SP3
  • Windows Server 2008 R2, 2012 R2, and 2016

The I3 instances offer up to 8 NVMe SSDs. In order to achieve the best possible throughput and as many IOPS as possible, you can stripe multiple devices together, or spread the I/O workload across them in another way.

Each vCPU (Virtual CPU) is a hardware hyperthread on an Intel E5-2686 v4 (Broadwell) processor running at 2.3 GHz. The processor supports the AVX2 instructions, along with Turbo Boost and NUMA.

Go For Launch
The I3 instances are available today in fifteen AWS regions and you can start to use them right now.

Jeff;