Category: Launch


Launch – Amazon Cognito User Pools General Availability: App Integration and Federation

Recently I was reading articles on Forbes.com, as well as some other tech-focused websites, about mobile user experience, engagement, and development. Almost every article mentioned that the success of a mobile app development project depends on delivering a well-designed user onboarding experience and an engaging mobile interface. An Inc.com article states that over 90% of all downloaded apps are used only once and then removed. The number three reason cited for users deleting mobile applications from their devices was a poor user experience and interface design. In addition, a subsequent article shares that one of the rules of mobile application engagement is to “Focus on quick wins during onboarding”.

Implementing a smooth mobile user experience is not easy, and I speak from experience as a developer who has built many mobile apps and struggled each time with the user interface. Identity is mission critical for applications, and it is usually the first entry point when onboarding users onto most mobile and web applications, so it is important to present these capabilities in a fluid and seamless user interface. Therefore, I am exultant over Amazon Cognito User Pools App Integration and Federation and thrilled to announce the general availability of this new service feature.

Just in case you have not taken advantage of Amazon Cognito yet, let me introduce you to the service. Amazon Cognito is a managed cloud service that allows you to add authentication, authorization, and user management to your web, mobile, and even IoT applications.

Amazon Cognito’s features consist of:

  • Amazon Cognito User Pools: create and maintain a user directory in order to add sign-up and sign-in to your mobile or web application. You can also sign in users to a user pool through social identity providers, as well as SAML-based identity providers (see the sketch after this list)
  • Amazon Cognito Federated Identities: enables the creation of unique identities for users and the ability to authenticate them with federated identity providers, such as Google or Facebook, for temporary, limited-privilege access to app resources
  • Amazon Cognito Sync: allows you to synchronize user profile data across mobile devices and the web without the need to build a backend. It supports offline access, cross-device synchronizing, and local data caching of application-related user data so the user app experience remains consistent regardless of the device.
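
To give a feel for the user pools API, here is a minimal sketch of signing up a user against a user pool with the AWS SDK for .NET; the region, app client ID, and user details are hypothetical placeholders, not values from this post.

using System;
using System.Collections.Generic;
using System.Threading.Tasks;
using Amazon;
using Amazon.CognitoIdentityProvider;
using Amazon.CognitoIdentityProvider.Model;

public class UserPoolSignUpExample
{
    public static void Main() => SignUpUserAsync().GetAwaiter().GetResult();

    private static async Task SignUpUserAsync()
    {
        // Hypothetical region and app client ID for illustration only
        var client = new AmazonCognitoIdentityProviderClient(RegionEndpoint.USEast1);

        var request = new SignUpRequest
        {
            ClientId = "YOUR_APP_CLIENT_ID",
            Username = "jane@example.com",
            Password = "SuperSecretPassword123!",
            UserAttributes = new List<AttributeType>
            {
                new AttributeType { Name = "email", Value = "jane@example.com" }
            }
        };

        // SignUp creates the user in the user pool; Cognito then sends a confirmation code
        SignUpResponse response = await client.SignUpAsync(request);
        Console.WriteLine($"User confirmed: {response.UserConfirmed}");
    }
}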

With the general availability of Amazon Cognito User Pools App Integration and Federation, we are now adding AWS-hosted user sign-up and sign-in UI pages to help web and mobile app developers effortlessly integrate and customize the onboarding user experience for their applications. In addition, when using the Cognito User Pools user directory, you can enhance your mobile client login capabilities by providing sign-up and sign-in through social identity providers, including Facebook, Google, and Login with Amazon, as well as through SAML with corporate identity providers such as Microsoft Active Directory.

The Amazon Cognito User Pools – App Integration and Federation features provided in this GA release are as follows:

App Integration with User Pools

  • Provides a hosted UI for sign-up, sign-in, forgot password, and more
  • Provides a new WebView for mobile clients
  • Developers can customize the hosted UI to match their style and branding
  • Enables the use of a custom logo and CSS styles

 

Federation with User Pools

  • Cognito handles interactions with identity providers to authenticate users and receive tokens
  • Identity providers are configured in Cognito, e.g. SAML metadata document, issuer URL, identifiers, and domains
  • Cognito User Pools act as a universal directory, providing user profiles and authentication tokens for both federated users and Cognito users
  • Supported identity providers: SAML, Facebook, Google, and Login with Amazon

 

OAuth 2.0 Support

  • Cognito supports OAuth 2.0, the industry-standard protocol for authorization
  • OAuth 2.0 permissions are defined as “scopes,” e.g. permission to read a user profile or edit photos
  • Client apps can request a set of scopes and, if permitted, get back an access token with those scopes; if the request is in the context of a user, the user can be authenticated
  • Client apps present the access token to a resource server to access the resources permitted by the scopes (see the sketch after this list)
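
As a rough illustration of that flow, here is a sketch of exchanging an authorization code for tokens against a user pool’s /oauth2/token endpoint with HttpClient; the domain, client ID, authorization code, and callback URL below are hypothetical placeholders.

using System;
using System.Collections.Generic;
using System.Net.Http;
using System.Threading.Tasks;

public class TokenExchangeExample
{
    public static void Main() => ExchangeCodeAsync().GetAwaiter().GetResult();

    private static async Task ExchangeCodeAsync()
    {
        using (var http = new HttpClient())
        {
            // Hypothetical hosted UI domain for illustration only
            var tokenEndpoint = "https://taracognitodomain-ga.auth.us-east-1.amazoncognito.com/oauth2/token";

            var form = new FormUrlEncodedContent(new Dictionary<string, string>
            {
                ["grant_type"] = "authorization_code",
                ["client_id"] = "YOUR_APP_CLIENT_ID",
                ["code"] = "AUTHORIZATION_CODE_FROM_CALLBACK",
                ["redirect_uri"] = "https://your-app.example.com/callback"
            });

            // On success the response body is JSON containing id_token, access_token, and refresh_token
            HttpResponseMessage response = await http.PostAsync(tokenEndpoint, form);
            string body = await response.Content.ReadAsStringAsync();
            Console.WriteLine(body);
        }
    }
}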

Since I can’t wait to try out these new features, let’s build a quick app using the new Cognito User Pool App Integration and Federation features. Therefore, off to the Cognito management console we go. Once in the console, I’ll quickly create a User Pool for our test by clicking the Manage your User Pools button.


Once in the User Pools console, I’ll click the Create a user pool button. On the Create a user pool screen, I’ll name my new user pool TaraCognitoGAPool, and since I want to customize my hosted UI and take advantage of the other new features, I’ll select the Step Through the Settings button to complete the creation of my user pool.


On the Attributes tab, I have two sign-in options for my users: they can sign in via Username, with the option to grant the user multiple alternatives for logging into my application, or they can sign in with their Email address or phone number, where I can allow both options or restrict sign-in to only one of them. Usually, I opt for Username with email, but since we are testing out the new App Integration and Federation features today, I am going to select Email address or phone number and only allow the use of email addresses for sign-in/sign-up.

Next stop is the Policies and Verifications tabs, where I will keep all of the default options selected. On the Verifications tab, however, I do want to give Cognito the option to send SMS messages on my behalf. Therefore, I will choose the Create Role button to create an IAM role granting SMS permissions. I complete my user pool option selections by clicking the Save changes button.

The last step is to go to the App clients tab and create an app client by clicking the Add an app client link on the page. I’ll name my app client, TaraCognitoGA-App, leave all the default options the same, and click the Create app client button.


All that is left is to review the TaraCognitoGAPool options and click the Create pool button.


Great! Now that my user pool, TaraCognitoGAPool, has been created, I can take advantage of the new App Integration and Federation features. If you have created a user pool before, you will notice that the user pool screen now contains tabs for the new user pool features in the menu sidebar.

This is what we’ve been waiting for. Now I will go into the App integration tab in order to configure settings for my own customized, built-in UI for signing up and signing in users to my TaraCognitoGA-App.

First I’ll go into my App client settings under the App integration tab. Here I will enable the identity providers I want to allow users of my application to use when signing in. Since I have only enabled Cognito User Pools as an identity provider, it is currently the only option. If I want to allow users to sign in with external identity providers like Facebook or a SAML provider, I will have to configure them under Federation. We’ll discuss this shortly.

For now, I’ll enter the callback URL that my app should go to once the user has successfully logged in, and the URL that the app should return to once the user has logged out. I’ve created a quick S3 website to use with my new Cognito sign-in. For more information on these options, please see Specifying Identity Provider Settings for Your User Pool App in the Cognito developer guide.


Now I’ll go to the Domain name option under the App integration tab and enter a domain prefix to be used for my sign-up and sign-in pages hosted by Cognito. Keeping with my current naming convention, I’ll name my domain taracognitodomain-ga and click the Check availability button. Remember, your domain name must be unique across the chosen AWS region and can only contain lowercase letters, numbers, and hyphens. Since my domain name is available, I will click Save changes and go to UI customization settings.
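
If you prefer to script this step, here is a sketch of creating the same hosted domain with the AWS SDK for .NET using the CreateUserPoolDomain API; the user pool ID below is a hypothetical placeholder.

using System;
using System.Threading.Tasks;
using Amazon;
using Amazon.CognitoIdentityProvider;
using Amazon.CognitoIdentityProvider.Model;

public class UserPoolDomainExample
{
    public static void Main() => CreateDomainAsync().GetAwaiter().GetResult();

    private static async Task CreateDomainAsync()
    {
        var client = new AmazonCognitoIdentityProviderClient(RegionEndpoint.USEast1);

        // The domain prefix must be unique in the region; the user pool ID is a placeholder
        var request = new CreateUserPoolDomainRequest
        {
            Domain = "taracognitodomain-ga",
            UserPoolId = "us-east-1_EXAMPLE"
        };

        await client.CreateUserPoolDomainAsync(request);
        Console.WriteLine("Hosted UI domain created.");
    }
}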


UI customization settings is where I can change the look and feel of the default AWS-hosted sign-in UI for a specific app client, or set the default look for all my app clients. I’ll select the app client I created earlier and upload my personal logo to be displayed on my page. Note that you can also customize the CSS for several fields and HTML tags on your page by selecting the options under the CSS customizations (optional) section and adding your CSS as desired.

After selecting the Save Changes button, I can now view my login page, which I can use for my web and mobile app clients. The hosted UI for your Cognito User Pool can be accessed by using a URL with the following pattern:

https://<your_domain>/login?response_type=code&client_id=<your_app_client_id>&redirect_uri=<your_callback>

That makes my hosted URL the following (my client ID is obfuscated):

Clicking on this link displays my custom sign-in and sign-up page hosted by AWS Cognito User Pools with my custom logo presented. How exciting!
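
For reference, here is a quick sketch of composing that hosted UI login URL in C#; the domain, app client ID, and callback URL are hypothetical placeholders.

using System;

public class HostedUiUrlExample
{
    public static void Main()
    {
        // Hypothetical values for illustration only
        string domain = "taracognitodomain-ga.auth.us-east-1.amazoncognito.com";
        string clientId = "YOUR_APP_CLIENT_ID";
        string callbackUrl = "https://your-app.example.com/callback";

        // The redirect_uri must exactly match a callback URL configured in App client settings
        string loginUrl = $"https://{domain}/login?response_type=code" +
                          $"&client_id={clientId}" +
                          $"&redirect_uri={Uri.EscapeDataString(callbackUrl)}";

        Console.WriteLine(loginUrl);
    }
}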

We’re cooking with gas! But wait, I promised that I would discuss how to tie Federation and external federated identity providers to your Cognito User Pool. I’m sure you thought that I had forgotten. No worries, I’ll discuss Federation now.

Configuring Federation with a social and/or a SAML identity provider is pretty easy. With Cognito’s built-in integrations, you no longer have to integrate multiple identity provider SDKs or handle redirects or post backs in your app. Cognito handles the identity provider interactions for you and creates user profiles for federated users in Cognito User Pools.

However, before I show you how to configure a Cognito Federated identity provider, I actually need one to show. Given this, I ran over to the Login with Amazon page and created an app so I can use it as one of my federated identity providers. Sure, I could have done Facebook or Google but everyone does those identity providers, and hey, we all need a little variety in our lives.


With my Login with Amazon app id and app secret in hand, I’ll return to the Cognito User Pool console and go to the Federation tab on the menu side bar. I’ll select the identity providers option, and here I am presented with all the identity providers supported for sign in with Cognito User Pools. Here I will select the Login with Amazon identity provider.


Now I will enter my app ID/client ID and app secret provided by the Login with Amazon service into the Amazon app ID and App secret fields. I can also determine what scopes and related data are authorized by this login. For this sample, I have chosen to enter profile, postal_code, and email in the Authorize scope field.

All that is left is for me to select the Enable Login with Amazon button, and I have successfully added the identity provider for the Login with Amazon identity service.


My final step is to go to the Attribute mapping section also under the Federation section. Here I will select the Amazon tab and map the Login with Amazon attributes to the attributes captured by Cognito User Pool. Once I hit that Save changes button, I have successfully added the Login with Amazon identity provider.

If I go back to App client settings, enable the Login with Amazon provider by checking its checkbox, and return to my Cognito-hosted login page, I now see that Cognito has successfully added Login with Amazon to my sign-in page.

 

Summary

Fantastic! Now as a developer, I can focus on making my app experience as smooth and engaging as possible, including a simple, customized sign in process for my app users without the heavy lifting typically needed to implement a sign in screen with social and SAML identity providers.

Amazon Cognito User Pools App Integration and Federation enables web and mobile app developers to easily integrate and customize a user experience for users to sign up and sign in through AWS-hosted web pages. Additionally, it simplifies user management by providing a unified user authentication and authorization mechanism, whether you use Cognito User Pools as a user directory and/or other identity providers, including Facebook, Google, Login with Amazon, and corporate SAML providers like Microsoft Active Directory. Learn more about this great service by checking out the Amazon Cognito product page or the Amazon Cognito developer guide.

Enjoy!

Tara

Launch – Hello Amazon Macie: Automatically Discover, Classify, and Secure Content at Scale

When Jeff and I heard about this service, we were both curious about the meaning of the name Macie. Of course, Jeff, being a great researcher, looked up the name and found that it has two meanings. The name has both French and English (UK) origins and is typically a girl’s name. The first meaning found said that the name meant “weapon”. The second noted that the name is representative of a person who is bold, sporty, and sweet. In a way, these definitions are appropriate, as today I am happy to announce that we are launching Amazon Macie, a new security service that uses machine learning to help identify and protect sensitive data stored in AWS from breaches, data leaks, and unauthorized access, with Amazon Simple Storage Service (S3) being the initial data store. Therefore, I can imagine that Amazon Macie could be described as a bold weapon for AWS customers, providing a sweet service with a sporty user interface that helps protect against malicious access of your data at rest. Whew, that was a mouthful, but I unbelievably got all the Macie descriptions out in a single sentence! Nevertheless, I am thrilled to share with you the power of the new Amazon Macie service.

Amazon Macie is a service powered by machine learning that can automatically discover and classify your data stored in Amazon S3. But Macie doesn’t stop there: once your data has been classified, Macie assigns each data item a business value and then continuously monitors the data in order to detect any suspicious activity based on access patterns. Key features of the Macie service include:

  • Data Security Automation: analyzes, classifies, and processes data to understand historical patterns, user authentication to data, data access locations, and times of access
  • Data Security & Monitoring: actively monitors usage log data for anomaly detection, along with automatic resolution of reported issues through CloudWatch Events and Lambda (see the sketch after this list)
  • Data Visibility for Proactive Loss Prevention: provides management visibility into details of stored data while providing immediate protection without the need for manual customer input
  • Data Research and Reporting: allows administrative configuration for reporting and alert management requirements
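
To illustrate that automation hook, here is a minimal sketch, assuming a CloudWatch Events rule routes Macie alerts to a C# Lambda function; the handler below just logs the raw event so you can plug in your own remediation, and no particular event shape is assumed.

using System.IO;
using Amazon.Lambda.Core;

namespace MacieAlertHandler
{
    public class Function
    {
        // Receives the CloudWatch Events payload as a raw stream, so no event-specific types are assumed
        public void Handler(Stream input, ILambdaContext context)
        {
            using (var reader = new StreamReader(input))
            {
                string rawEvent = reader.ReadToEnd();
                context.Logger.LogLine("Received Macie alert event:");
                context.Logger.LogLine(rawEvent);

                // TODO: inspect the alert and trigger your own remediation,
                // for example tightening an S3 bucket policy or notifying a security team
            }
        }
    }
}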

How does Amazon Macie accomplish this you ask? 

Using machine learning algorithms for natural language processing (NLP), Macie can automate the classification of data in your S3 buckets. In addition, Amazon Macie takes advantage of predictive analytics algorithms, enabling data access patterns to be dynamically analyzed. Learnings are then used to inform you and alert you on possible suspicious behavior. Macie also runs an engine specifically to detect common sources of personally identifiable information (PII) or sensitive personal information (SP). Macie takes advantage of AWS CloudTrail and continuously checks CloudTrail events for PUT requests in S3 buckets, automatically classifying new objects in near real time.

While Macie is a powerful tool to use for security and data protection in the AWS cloud, it can also aid you with governance, compliance requirements, and/or audit standards. Many of you may already be aware of the EU’s most stringent privacy regulation to date, the General Data Protection Regulation (GDPR), which becomes enforceable on May 25, 2018. As Amazon Macie recognizes personally identifiable information (PII) and provides customers with dashboards and alerts, it will enable customers to comply with GDPR regulations around encryption and pseudonymization of data. When combined with Lambda queries, Macie becomes a powerful tool to help remediate GDPR concerns.

Tour of the Amazon Macie Service

Let’s take a tour of the service and look at Amazon Macie up close and personal.

First, I will log onto the Macie console and start the process of setting up Macie so that I can start my data classification and protection by clicking the Get Started button.


As you can see, to enable the Amazon Macie service, I must have the appropriate IAM roles created for the service, and additionally I will need to have AWS CloudTrail enabled in my account.

I will create these roles and turn on the AWS CloudTrail service in my account. To make setup easier, you can take advantage of the sample CloudFormation template provided in the Macie User Guide, which will set up the required IAM roles and policies for you; you would then only need to set up a trail as noted in the CloudTrail documentation.

If you have multiple AWS accounts, you should note that the account you use to enable the Macie service will be designated as the master account. You can integrate other accounts with the Macie service, but they will have the member account designation. Users from member accounts will need to use an IAM role to federate access to the master account in order to access the Macie console.

Now that my IAM roles are created and CloudTrail is enabled, I will click the Enable Macie button to start Macie’s data monitoring and protection.


Once Macie has finished starting up in your account, you will be brought to the service’s main screen, and any existing alerts in your account will be presented to you. Since I have just started the service, I currently have no existing alerts.


Considering we are doing a tour of the Macie service, I will now integrate some of my S3 buckets with Macie. However, you do not have to specify any S3 buckets for Macie to start monitoring, since the service already uses the AWS CloudTrail Management API to analyze and process information. For this tour of Macie, I have decided to monitor object-level API events in CloudTrail for certain buckets.

In order to integrate with S3, I will go to the Integrations tab of the Macie console. Once on the Integrations tab, I will see two options: Accounts and Services. The Accounts option is used to integrate member accounts with Macie and to set your data retention policy. Since I want to integrate specific S3 buckets with Macie, I’ll click the Services option to go to the Services tab.


When I integrate Macie with the S3 service, a trail and an S3 bucket will be created to store logs about S3 data events. To get started, I will use the Select an account dropdown to choose an account. Once my account is selected, the services available for integration are presented. I’ll select the Amazon S3 service by clicking the Add button.

Now I can select the buckets that I want Macie to analyze. Selecting the Review and Save button takes me to a screen where I confirm that I want object-level logging by clicking the Save button.

Next, on our Macie tour, let’s look at how we can customize data classification with Macie.

As we discussed, Macie will automatically monitor and classify your data. Once Macie identifies your data, it will classify your data objects by file and content type. Macie also uses a support vector machine (SVM) classifier to classify the content within S3 objects, in addition to the metadata of the file. In machine learning, support vector machines are supervised learning models with learning algorithms used for classification and regression analysis of data. Macie trained the SVM classifier using data of varying content types, optimized to support accurate detection of data content, even including source code you may write.

Macie will assign only one content type per data object or file; however, you have the ability to enable or disable content types and file extensions in order to include or exclude them from Macie’s classification. Once Macie classifies the data, it assigns the object a risk level between 1 and 10, with 10 being the highest risk and 1 being the lowest.

To customize the classification of our data with Macie, I’ll go to the Settings Tab. I am now presented with the choices available to enable or disable the Macie classifications settings.


As an example during our tour of Macie, I will choose File extension, and I am presented with the list of file extensions that Macie tracks and uses for classification.

As a test, I’ll edit the apk file extension, used for Android application install files, and disable monitoring of these files by selecting No – disabled from the dropdown and clicking the Save button. Of course, later I will turn this back on, since I want to keep my entire collection of data files safe, including my Android development binaries.


One last thing I want to note about data classification using Macie: the service provides visibility into how your data objects are being classified and highlights the data assets you have stored that are critical or important for compliance, for personal data, and for your business.

Now that we have explored the data that Macie classifies and monitors, the last stop on our service tour is the Macie dashboard.

 

The Macie Dashboard provides us with a complete picture of all of the data and activity that has been gathered as Macie monitors and classifies our data. The dashboard displays Metrics and Views grouped by categories to provide different visual perspectives of your data. Within these dashboard screens, you can also go from a metric perspective directly to the Research tab to build and run queries based on the metric. These queries can be used to set up customized alerts for notification of any possible security issues or problems. We won’t have an opportunity to tour the Research or Alerts tabs, but you can find more information about these features in the Macie user guide.

Turning back to the Dashboard, there are so many great resources in the Macie Dashboard that we will not be able to stop at each view, metric, and feature during our tour, so let me give you an overview of all the features of the dashboard that you can take advantage of using.

Dashboard Metrics – monitored data grouped by the following categories:

  • High-risk S3 objects: data objects with risk levels of 8 through 10
  • Total event occurrences: total count of all event occurrences since Macie was enabled
  • Total user sessions: 5-minute snapshot of CloudTrail data

Dashboard Views – views to display various points of the monitored data and activity:

  • S3 objects for a selected time range
  • S3 objects
  • S3 objects by personally identifiable information (PII)
  • S3 objects by ACL
  • CloudTrail events and associated users
  • CloudTrail errors and associated users
  • Activity location
  • AWS CloudTrail events
  • Activity ISPs
  • AWS CloudTrail user identity types

Summary

Well, that concludes our tour of the new and exciting Amazon Macie service. Amazon Macie is a sensational new service that uses the power of machine learning and deep learning to aid you in securing, identifying, and protecting your data stored in Amazon S3. Using natural language processing (NLP) to automate data classification, Amazon Macie enables you to easily get started with high accuracy classification and immediate protection of your data by simply enabling the service.  The interactive dashboards give visibility to the where, what, who, and when of your information allowing you to proactively analyze massive streams of data, data accesses, and API calls in your environment. Learn more about Amazon Macie by visiting the product page or the documentation in the Amazon Macie user guide.

Tara

New – Amazon Web Services Extends CloudTrail to All AWS Customers

I have exciting news for all Amazon Web Services customers! I have been waiting patiently to share this great news with all of you and finally, the wait is over. AWS CloudTrail is now enabled by default for ALL CUSTOMERS and will provide visibility into the past seven days of account activity without the need for you to configure a trail in the service to get started. This new ‘always on’ capability provides the ability to view, search, and download the aforementioned account activity through the CloudTrail Event History.

For those of you who haven’t taken advantage of AWS CloudTrail yet, let me explain why I am thrilled to have this essential service for operational troubleshooting and review, compliance, auditing, and security turned on by default for all AWS accounts.

AWS CloudTrail captures account activity and events for supported services made in your AWS account and sends the event log files to Amazon Simple Storage Service (S3), CloudWatch Logs, and CloudWatch Events. With CloudTrail, you typically create a trail, a configuration enabling logging of account activity and events. CloudTrail then fast-tracks your ability to analyze operational and security issues by providing visibility into the API activity happening in your AWS account. CloudTrail supports multi-region configurations, and when integrated with CloudWatch you can create triggers for events you want to monitor or create a subscription to send activity to AWS Lambda. Taking advantage of the CloudTrail service means that you have a searchable historical record of calls made from your account by other AWS services, the AWS Command Line Interface (CLI), the AWS Management Console, and the AWS SDKs.

The key features of AWS CloudTrail are:

  • Always On: enabled on all AWS accounts and records your account activity upon account creation without the need to configure CloudTrail
  • Event History: view, search, and download your recent AWS account activity
  • Management Level Events: get details about administrative actions such as creation, deletion, and modification of EC2 instances or S3 buckets
  • Data Level Events: record all API actions on Amazon S3 objects and receive detailed information about API actions
  • Log File Integrity Validation: validate the integrity of log files stored in your S3 bucket
  • Log File Encryption: all log files delivered to your S3 bucket are encrypted by default using S3 server-side encryption (SSE), with the option to encrypt log files with AWS Key Management Service (AWS KMS) instead
  • Multi-region Configuration: configure service to deliver log files from multiple regions

You can read more about the features of AWS CloudTrail on the product detail page.

As my colleague, Randall Hunt, reminded me: CloudTrail is essential when helping customers to troubleshoot their solutions. What most AWS resources, like those of us on the Technical Evangelist team or the great folks on the Solutions Architect team, will say is “Enable CloudTrail” so we can examine the details of what’s going on. Therefore, it’s no wonder that I am ecstatic to share that with this release, all AWS customers can view account activity by using the AWS console or the AWS CLI/API, including the ability to search and download seven days of account activity for operations of all supported services.

With CloudTrail being enabled by default, all AWS customers can now log into CloudTrail and review their Event History. In this view, not only do you see the last seven days of events, but you can also select an event to view more information about it.
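
If you’d rather query Event History programmatically, here is a sketch using the CloudTrail LookupEvents API via the AWS SDK for .NET; the event name used as a filter is just an example.

using System;
using System.Collections.Generic;
using System.Threading.Tasks;
using Amazon;
using Amazon.CloudTrail;
using Amazon.CloudTrail.Model;

public class EventHistoryExample
{
    public static void Main() => LookupRecentEventsAsync().GetAwaiter().GetResult();

    private static async Task LookupRecentEventsAsync()
    {
        var client = new AmazonCloudTrailClient(RegionEndpoint.USEast1);

        // Look up the last seven days of CreateBucket calls (example filter)
        var request = new LookupEventsRequest
        {
            StartTime = DateTime.UtcNow.AddDays(-7),
            EndTime = DateTime.UtcNow,
            LookupAttributes = new List<LookupAttribute>
            {
                new LookupAttribute
                {
                    AttributeKey = LookupAttributeKey.EventName,
                    AttributeValue = "CreateBucket"
                }
            }
        };

        LookupEventsResponse response = await client.LookupEventsAsync(request);
        foreach (Event e in response.Events)
        {
            Console.WriteLine($"{e.EventTime} {e.EventName} by {e.Username}");
        }
    }
}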

Of course, if you want to access your CloudTrail log files directly or archive your logs for auditing purposes, you can still create a trail and specify the S3 bucket for your log file delivery. Creating a trail also allows you to deliver events to CloudWatch Logs and CloudWatch Events, and is a very easy process.

After logging into the CloudTrail console, you would simply click the Create a trail button.


You then would enter a trail name in the Trail name text box and select the radio button for applying your trail configuration to all regions or only to the region you are currently in. For this example, I’ll name my trail TEW-USEast-Region-Trail and select No for the Apply trail to all regions radio button. This means that this trail will only track events and activities in the current region, which right now is US East (N. Virginia). Please note: a best practice is to select Yes for the Apply trail to all regions option to ensure that you capture all events related to your AWS account, including global service events.


Under Management events, I select the Read/Write events radio button option for which operations I want CloudTrail to track. In this case, I will select the All option.

Next step is for me to select the S3 buckets for which I desire to track the S3 object-level operations. This is an optional step, but note that by default trails do not log Data Events. Therefore, if you want to track the S3 object event activity you can configure your trail to track Data Events for objects in the bucket you specify in the Data events section. I’ll select my aws-blog-tew-posts S3 bucket, and keep the default option to track all Read/Write operations.


My final step in the creation of my trail is to select an S3 bucket in the Storage Location section of the console where I wish to house my CloudTrail logs. I can either have CloudTrail create a new bucket on my behalf or select an existing bucket in my account. I will opt to have CloudTrail create a new bucket for me, so I will enter a unique bucket name of tew-cloudtrail-logbucket in the text box. I want to make sure that I can find my logs easily, so I will expand the Advanced section of the Storage Location and add a prefix. This is most helpful when you want to add search criteria to logs being stored in your bucket. For my prefix, I will just enter tew-2017. I’ll keep the default selections for the other Advanced options shown, which include choices for Encrypt log files, Enable log file validation, and Send SNS notification for every log file delivery.

That’s it! Once I click the Create button, I have successfully created a trail for AWS CloudTrail.
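
For teams that prefer automation over the console, here is a sketch of creating and starting a similar trail with the AWS SDK for .NET; it assumes the target bucket already exists with a bucket policy that allows CloudTrail to write to it.

using System;
using System.Threading.Tasks;
using Amazon;
using Amazon.CloudTrail;
using Amazon.CloudTrail.Model;

public class CreateTrailExample
{
    public static void Main() => CreateAndStartTrailAsync().GetAwaiter().GetResult();

    private static async Task CreateAndStartTrailAsync()
    {
        var client = new AmazonCloudTrailClient(RegionEndpoint.USEast1);

        // Create a single-region trail that delivers logs to an existing bucket with a prefix
        var createRequest = new CreateTrailRequest
        {
            Name = "TEW-USEast-Region-Trail",
            S3BucketName = "tew-cloudtrail-logbucket",
            S3KeyPrefix = "tew-2017",
            IsMultiRegionTrail = false,
            IncludeGlobalServiceEvents = false,
            EnableLogFileValidation = true
        };
        await client.CreateTrailAsync(createRequest);

        // Creating a trail does not start logging; StartLogging turns it on
        await client.StartLoggingAsync(new StartLoggingRequest { Name = "TEW-USEast-Region-Trail" });
        Console.WriteLine("Trail created and logging started.");
    }
}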

 

Ready to get started?

You can learn more about AWS CloudTrail by visiting the service product page, the CloudTrail documentation, and/or AWS CloudTrail frequently asked questions. Head over to the CloudTrail service console to view and search your CloudTrail events, with or without a trail configured.

Enjoy the new launch of CloudTrail for All AWS Customers, and all the goodness that you will get from taking advantage of this great service!

Tara

Launch – .NET Core Support in AWS CodeStar and AWS CodeBuild

A few months ago, I introduced the AWS CodeStar service, which allows you to quickly develop, build, and deploy applications on AWS. AWS CodeStar helps development teams to increase the pace of releasing applications and solutions while reducing some of the challenges of building great software.

When the CodeStar service launched in April, it was released with several project templates for Amazon EC2, AWS Elastic Beanstalk, and AWS Lambda using five different programming languages: JavaScript, Java, Python, Ruby, and PHP. Each template provisions the underlying AWS Code Services and configures an end-to-end continuous delivery pipeline for the targeted application using AWS CodeCommit, AWS CodeBuild, AWS CodePipeline, and AWS CodeDeploy.

As I have participated in some of the AWS Summits around the world discussing AWS CodeStar, many of you have shown curiosity in learning about the availability of .NET templates in CodeStar and utilizing CodeStar to deploy .NET applications. Therefore, it is with great pleasure and excitement that I announce that you can now develop, build, and deploy cross-platform .NET Core applications with the AWS CodeStar and AWS CodeBuild services.

AWS CodeBuild has added the ability to build and deploy .NET Core application code to both Amazon EC2 and AWS Lambda. This new CodeBuild capability has enabled the addition of two new project templates in AWS CodeStar for .NET Core applications. These new project templates enable you to deploy .NET Core applications to Amazon EC2 Linux instances, and provide everything you need to get started quickly, including .NET Core sample code and a full software development toolchain.

Of course, I can’t wait to try out the new addition to the project templates within CodeStar and the updated .NET application build options with CodeBuild. For my test scenario, I will use CodeStar to create, build, and deploy my .NET Core ASP.NET web application on EC2. Then, I will extend my ASP.NET application by creating a .NET Lambda function to be compiled and deployed with CodeBuild as a part of my application’s pipeline. This Lambda function can then be called and used within my ASP.NET application to extend the functionality of my web application.

So, let’s get started!

First, I’ll log into the CodeStar console and start a new CodeStar project. I am presented with the option to select a project template.


Right now, I would like to focus on building .NET Core projects; therefore, I’ll filter the project templates by selecting C# in the Programming Languages section. Now, CodeStar only shows me the new .NET Core project templates that I can use to build web applications and services with ASP.NET Core.

I think I’ll use the ASP.NET Core web application project template for my first CodeStar .NET Core application. As you can see by the project template information display, my web application will be deployed on Amazon EC2, which signifies to me that my .NET Core code will be compiled and packaged using AWS CodeBuild and deployed to EC2 using the AWS CodeDeploy service.


My hunch about the services is confirmed on the next screen when CodeStar shows the AWS CodePipeline and the AWS services that will be configured for my new project. I’ll name this web application project, ASPNetCore4Tara, and leave the default Project ID that CodeStar generates from the project name. Yes, I know that this is one of the goofiest names I could ever come up with, but, hey, it will do for this test project so I’ll go ahead and click the Next button. I should mention that you have the option to edit your Amazon EC2 configuration for your project on this screen before CodeStar starts configuring and provisioning the services needed to run your application.

Since my ASP.NET Core web application will be deployed to an Amazon EC2 instance, I will need to choose an Amazon EC2 key pair so I can SSH into the instance. For my ASPNetCore4Tara project, I will use an existing Amazon EC2 key pair I have previously used for launching my other EC2 instances. However, if I were creating this project and did not have an EC2 key pair, or if I didn’t have access to the .pem file (private key file) for an existing EC2 key pair, I would have to first visit the EC2 console and create a new EC2 key pair to use for my project. This is important because, remember, without the EC2 key pair and the associated .pem file, I would not be able to log into my EC2 instance.

With my EC2 key pair selected and confirmation that I have the related private key file checked, I am ready to click the Create Project button.


After CodeStar completes the creation of the project and the provisioning of the project-related AWS services, I am ready to view the CodeStar sample application from the application endpoint displayed in the CodeStar dashboard. This sample application should be familiar to you if you have been working with the CodeStar service or if you had an opportunity to read the blog post about the AWS CodeStar service launch. I’ll click the link underneath Application Endpoints to view the sample ASP.NET Core web application.

Now I’ll go ahead and clone the generated project and connect my Visual Studio IDE to the project repository. I am going to make some changes to the application and since AWS CodeBuild now supports .NET Core builds and deployments to both Amazon EC2 and AWS Lambda, I will alter my build specification file appropriately for the changes to my web application that will include the use of the Lambda function.  Don’t worry if you are not familiar with how to clone the project and connect it to the Visual Studio IDE, CodeStar provides in-console step-by-step instructions to assist you.

First things first, I will open up the Visual Studio IDE and connect to the AWS CodeCommit repository provisioned for my ASPNetCore4Tara project. It is important to note that the Visual Studio 2017 IDE is required for .NET Core projects in AWS CodeStar, and the AWS Toolkit for Visual Studio 2017 will need to be installed prior to connecting your project repository to the IDE.

In order to connect to my repo within Visual Studio, I will open up Team Explorer and select the Connect link under the AWS CodeCommit option under Hosted Service Providers. I will click Ok to keep my default AWS profile toolkit credentials.

I’ll then click Clone under the Manage Connections and AWS CodeCommit hosted provider section.

Once I select my aspnetcore4tara repository in the Clone AWS CodeCommit Repository dialog, I only have to enter my IAM role’s HTTPS Git credentials in the Git Credentials for AWS CodeCommit dialog and my process is complete. If you’re following along and receive a dialog for Git Credential Manager login, don’t worry, just enter the same IAM role’s Git credentials.


My project is now connected to the aspnetcore4tara CodeCommit repository and my web application is loaded for editing. As you will notice in the screenshot below, the sample project is structured as a standard ASP.NET Core MVC web application.

With the project created, I can make changes and updates. Since I want to update this project with a .NET Lambda function, I’ll quickly start a new project in Visual Studio to author a very simple C# Lambda function to be compiled with the CodeStar project. This AWS Lambda function will be included in the CodeStar ASP.NET Core web application project.

The Lambda function I’ve created makes a call to the REST API of NASA’s popular Astronomy Picture of the Day website. The API sends back the latest planetary image and related information in JSON format. You can see the Lambda function code below.

using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;

using System.Net.Http;
using Amazon.Lambda.Core;

// Assembly attribute to enable the Lambda function's JSON input to be converted into a .NET class.
[assembly: LambdaSerializer(typeof(Amazon.Lambda.Serialization.Json.JsonSerializer))]

namespace NASAPicOfTheDay
{
    public class SpacePic
    {
        HttpClient httpClient = new HttpClient();
        string nasaRestApi = "https://api.nasa.gov/planetary/apod?api_key=DEMO_KEY";

        /// <summary>
        /// A simple function that retrieves NASA Planetary Info and 
        /// Picture of the Day
        /// </summary>
        /// <param name="context"></param>
        /// <returns>nasaResponse-JSON String</returns>
        public async Task<string> GetNASAPicInfo(ILambdaContext context)
        {
            string nasaResponse;
            
            //Call NASA Picture of the Day API
            nasaResponse = await httpClient.GetStringAsync(nasaRestApi);
            Console.WriteLine("NASA API Response");
            Console.WriteLine(nasaResponse);
            
            //Return NASA response - JSON format
            return nasaResponse; 
        }
    }
}

I’ll now publish this C# Lambda function and test it by using the Publish to AWS Lambda option provided by the AWS Toolkit for Visual Studio with the NASAPicOfTheDay project. After publishing the function, I can test it and verify that it is working correctly within Visual Studio and/or the AWS Lambda console. You can learn more about building AWS Lambda functions with C# and .NET at: http://docs.aws.amazon.com/lambda/latest/dg/dotnet-programming-model.html
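
To show how the web application could then call this function, here is a rough sketch of invoking it from ASP.NET code using the AWS SDK for .NET; it assumes the function was published or deployed under the name NASAPicAPI (the name used later in the buildspec), and wiring this class into a controller is left to the reader.

using System.IO;
using System.Threading.Tasks;
using Amazon;
using Amazon.Lambda;
using Amazon.Lambda.Model;

public class NasaPicClient
{
    private readonly AmazonLambdaClient _lambdaClient =
        new AmazonLambdaClient(RegionEndpoint.USEast1);

    // Invokes the NASAPicAPI Lambda function synchronously and returns its JSON response
    public async Task<string> GetPictureOfTheDayAsync()
    {
        var request = new InvokeRequest
        {
            FunctionName = "NASAPicAPI",
            InvocationType = InvocationType.RequestResponse,
            Payload = "{}" // the function ignores its input
        };

        InvokeResponse response = await _lambdaClient.InvokeAsync(request);

        using (var reader = new StreamReader(response.Payload))
        {
            return await reader.ReadToEndAsync();
        }
    }
}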

 

Now that I have my Lambda function completed and tested, all that is left is to update the CodeBuild buildspec.yml file within my aspnetcore4tara CodeStar project to include publishing and deploying of the Lambda function.

To accomplish this, I will create a new folder named functions and copy the folder that contains my Lambda function .NET project to my aspnetcore4tara web application project directory.

 

 

To build and publish my AWS Lambda function, I will use commands in the buildspec.yml file from the aws-lambda-dotnet tools library, which helps .NET Core developers develop AWS Lambda functions. I add a file, funcprof, to the NASAPicOfTheDay folder, which contains customized profile information for use with the aws-lambda-dotnet tools. All that is left is to update the buildspec.yml file used by CodeBuild for the ASPNetCore4Tara project build to include the packaging and the deployment of the NASAPicOfTheDay AWS Lambda function. The updated buildspec.yml is as follows:

version: 0.2
env:
  variables:
    basePath: 'hold'
phases:
  install:
    commands:
      - echo set basePath for project
      - basePath=$(pwd)
      - echo $basePath
      - echo Build restore and package Lambda function using AWS .NET Tools...
      - dotnet restore functions/*/NASAPicOfTheDay.csproj
      - cd functions/NASAPicOfTheDay
      - dotnet lambda package -c Release -f netcoreapp1.0 -o ../lambda_build/nasa-lambda-function.zip
  pre_build:
    commands:
      - echo Deploy Lambda function used in ASPNET application using AWS .NET Tools. Must be in path of Lambda function build 
      - cd $basePath
      - cd functions/NASAPicOfTheDay
      - dotnet lambda deploy-function NASAPicAPI -c Release -pac ../lambda_build/nasa-lambda-function.zip --profile-location funcprof -fd 'NASA API for Picture of the Day' -fn NASAPicAPI -fh NASAPicOfTheDay::NASAPicOfTheDay.SpacePic::GetNASAPicInfo -frun dotnetcore1.0 -frole arn:aws:iam::xxxxxxxxxxxx:role/lambda_exec_role -framework netcoreapp1.0 -fms 256 -ft 30  
      - echo Lambda function is now deployed - Now change directory back to Base path
      - cd $basePath
      - echo Restore started on `date`
      - dotnet restore AspNetCoreWebApplication/AspNetCoreWebApplication.csproj
  build:
    commands:
      - echo Build started on `date`
      - dotnet publish -c release -o ./build_output AspNetCoreWebApplication/AspNetCoreWebApplication.csproj
artifacts:
  files:
    - AspNetCoreWebApplication/build_output/**/*
    - scripts/**/*
    - appspec.yml
    

That’s it! All that is left is for me to add and commit all my file additions and updates to the AWS CodeCommit Git repository provisioned for my ASPNetCore4Tara project. This kicks off the AWS CodePipeline for the project, which will now use AWS CodeBuild’s new support for .NET Core to build and deploy both the ASP.NET Core web application and the .NET AWS Lambda function.

 

Summary

The support for .NET Core in AWS CodeStar and AWS CodeBuild opens the door for .NET developers to take advantage of the benefits of continuous integration and delivery when building .NET-based solutions on AWS. Read more about .NET Core support in AWS CodeStar and AWS CodeBuild here, or review the product pages for AWS CodeStar and/or AWS CodeBuild for more information on using the services.

Enjoy building .NET projects more efficiently with Amazon Web Services using .NET Core with AWS CodeStar and AWS CodeBuild.

Tara

 

Amazon Polly – Announcing Speech Marks and Whispering

Like me, you may have loved going to the library or bookstore to have your favorite book narrated to you. As a child, I loved listening to books narrated by good storytellers who gave life to their stories by changing the inflection of their voice as needed. The book narration coupled with the visual aids the storytellers used to tell the story, drove my love for reading and exploring new books.

In fact, in order to ensure that my love of reading extended to classic novels, my parents bought my sister and me a small projector device with a tape recorder. This device would narrate the story and synchronize the projection of the visuals from the book by using a chime sound to signal when we should advance to the next screen. While I have unfortunately dated myself with that story, it is great to look back and consider how far we have come with speech technologies like Text-to-Speech (TTS). Even with all of these advancements, it is still challenging for developers to add synchronized speech/voice to the animations of characters or graphics in their games, videos, and digital books using TTS. Additionally, it is very rare to successfully use a TTS solution to emulate the pitch, tempo, and loudness of speech in lifelike voices.
With this in mind, I am happy to announce Amazon Polly is launching support for Speech Marks and Whispering.

Amazon Polly is a deep learning service that enables you to turn text into lifelike speech. You can select a voice of your choice from the 47 lifelike voices included in the service, with support for 24 languages. Using Polly, you send the text you want to convert into speech to the Polly API, and it returns an audio stream that you can play or store in common audio file formats like MP3.

Speech Marks are metadata that allow developers to synchronize speech with visual experiences. This feature enables scenarios like lip-syncing by synchronizing speech with facial animations, or highlighting written words as they are spoken. The speech marks metadata describes the synthesized speech, and by using it alongside the speech audio stream you can determine the beginning and ending of sounds, words, sentences, and SSML tags. With the new Speech Marks, developers can now create lip-syncing avatars and visually highlighted read-along experiences, and integrate speech capabilities into gaming engines like Amazon Lumberyard to give a voice to their characters.

There are four types of speech marks:

  • Sentence: Specifies a sentence element in the input text
  • Word: Indicates a word element in the input text
  • Viseme: Illustrates the position of the face and mouth corresponding to the sound that is spoken
  • Speech Synthesis Markup Language (SSML): Describes a <mark> element from the SSML input text.

Whispering is a speech effect similar to pitch, tempo, and loudness, in that it provides developers with one more expressive voice feature with which they can modify the Text-to-Speech output. The whispering feature allows developers to have words from their input text spoken in a whispered voice using the <amazon:effect name="whispered"> SSML element.

Let’s take a quick look at both these new features.

 

Using Speech Marks

I’ll jump into an example of using speech marks with Amazon Polly in the AWS Console. I’ll go first to the Amazon Polly console and press the Get started button.

I’m taken to the Text-To-Speech menu option, and I select the SSML tab under the Text-to-Speech section. I will simply add two sentences that I wish to be spoken in the provided text field and then select a Voice.

I’ll verify the sentences are in the form that I wish them to be spoken by clicking the Listen to Speech button. Since I like what I hear, I will proceed with adding the speech marks metadata. In order to use speech marks, I will select the Change file format link.

When the Change file format dialog box comes up, I will select Speech Marks as the File Format option, and under the Speech Mark Types section, I will choose Word and Sentence by checking the checkbox beside each speech mark type. Now I will click the Change button.


This returns me to the Text-To-Speech section of the console, and I can now click the Download Speech Marks button to see the generated speech marks.

The downloaded file has a .marks extension and contains JSON with information about the start and end of each of my sentences and words. The JSON fields are listed below, followed by a sketch of requesting the same speech marks through the SDK:

  • Time: timestamp in milliseconds from the beginning of the audio stream
  • Type: type of speech mark (sentence, word, viseme, or ssml)
  • Start: offset in bytes of the object’s start in the input text (not including viseme marks)
  • End: offset in bytes of the object’s end in the input text (not including viseme marks)
  • Value: data that varies based on the type of speech mark, e.g. a sentence speech mark contains the entire sentence from the text
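
Outside of the console, the same speech marks can be requested through the SynthesizeSpeech API. Here is a sketch using the AWS SDK for .NET; the voice and sample text are arbitrary choices.

using System;
using System.Collections.Generic;
using System.IO;
using System.Threading.Tasks;
using Amazon;
using Amazon.Polly;
using Amazon.Polly.Model;

public class SpeechMarksExample
{
    public static void Main() => GetSpeechMarksAsync().GetAwaiter().GetResult();

    private static async Task GetSpeechMarksAsync()
    {
        var polly = new AmazonPollyClient(RegionEndpoint.USEast1);

        // Request JSON speech marks (one JSON object per line) instead of audio
        var request = new SynthesizeSpeechRequest
        {
            Text = "Hi! My name is Tara. I am excited to talk about Polly's new features.",
            VoiceId = VoiceId.Joanna,
            OutputFormat = OutputFormat.Json,
            SpeechMarkTypes = new List<string> { "sentence", "word" }
        };

        SynthesizeSpeechResponse response = await polly.SynthesizeSpeechAsync(request);

        using (var reader = new StreamReader(response.AudioStream))
        {
            // Each line describes the time, type, byte offsets, and value of one mark
            Console.WriteLine(await reader.ReadToEndAsync());
        }
    }
}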

 

Using Whispering

As I noted previously, the Whispering feature allows me to have my input text spoken in a whispered voice using the SSML amazon:effect element with a name attribute value of whispered. I’ll use my example above and insert SSML elements to have some of my text spoken using a whispered voice.

I’ll return to the Amazon Polly console and, in the text box, change my current text to use the new whispered voice feature for the sentence, “My name is Tara”. To accomplish this, I will use the following SSML element: <amazon:effect name="whispered">. Therefore, the final sentence with SSML marks entered into the text box looks as follows:

<speak>Hi!<amazon:effect name="whispered">My name is Tara.</amazon:effect>I am excited to talk about Polly's new features.</speak>

When I click the Listen to speech button, I will hear that the sentence, “My name is Tara” is indeed spoken in a whispered voice.

I want to download my speech output, so I will click the Change file format link. When the Change file format dialog box comes up, I will select the MP3 option under File format section then click the Change button.


Now I have the option to download my file by clicking the Download MP3 button.

You can hear my speech output using the new whispered voice by clicking here.
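
If you want to produce the same whispered MP3 programmatically, here is a sketch using the AWS SDK for .NET with SSML input; the voice, output file name, and text are arbitrary choices.

using System;
using System.IO;
using System.Threading.Tasks;
using Amazon;
using Amazon.Polly;
using Amazon.Polly.Model;

public class WhisperExample
{
    public static void Main() => SynthesizeWhisperAsync().GetAwaiter().GetResult();

    private static async Task SynthesizeWhisperAsync()
    {
        var polly = new AmazonPollyClient(RegionEndpoint.USEast1);

        string ssml = "<speak>Hi!<amazon:effect name=\"whispered\">My name is Tara.</amazon:effect>" +
                      "I am excited to talk about Polly's new features.</speak>";

        var request = new SynthesizeSpeechRequest
        {
            Text = ssml,
            TextType = TextType.Ssml,
            VoiceId = VoiceId.Joanna,
            OutputFormat = OutputFormat.Mp3
        };

        SynthesizeSpeechResponse response = await polly.SynthesizeSpeechAsync(request);

        // Save the returned audio stream as an MP3 file
        using (var fileStream = File.Create("whispered-speech.mp3"))
        {
            await response.AudioStream.CopyToAsync(fileStream);
        }
        Console.WriteLine("Saved whispered-speech.mp3");
    }
}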

Summary

The Speech Marks and Whispering features are available in Amazon Polly starting today. To learn more about these and other features visit the Amazon Polly developer guide found here: http://docs.aws.amazon.com/polly/latest/dg

For more information about Amazon Polly, visit the Amazon Polly product page or get started by converting your text to speech in the Amazon Polly console.

You should give your text the gift of voice with Amazon Polly today.

Tara

New – Introducing AWS CodeStar – Quickly Develop, Build, and Deploy Applications on AWS

It wasn’t too long ago that I was on a development team working toward completing a software project by a release deadline and facing the challenges most software teams face today in developing applications. Challenges such as new project environment setup, team member collaboration, and the day-to-day task of keeping track of the moving pieces of code, configuration, and libraries for each development build. Today, with companies’ need to innovate and get to market faster, it has become essential to make it easier and more efficient for development teams to create, build, and deploy software.

Unfortunately, many organizations face some key challenges in their quest for a more agile, dynamic software development process. The first challenge most new software projects face is the lengthy setup process that developers have to complete before they can start coding. This process may include setting up of IDEs, getting access to the appropriate code repositories, and/or identifying infrastructure needed for builds, tests, and production.

Collaboration is another challenge that most development teams may face. In order to provide a secure environment for all members of the project, teams have to frequently set up separate projects and tools for various team roles and needs. In addition, providing information to all stakeholders about updates on assignments, the progression of development, and reporting software issues can be time-consuming.

Finally, most companies desire to increase the speed of their software development and reduce the time to market by adopting best practices around continuous integration and continuous delivery. Implementing these agile development strategies may require companies to spend time in educating teams on methodologies and setting up resources for these new processes.

Now Presenting: AWS CodeStar

To help development teams ease the challenges of building software while helping to increase the pace of releasing applications and solutions, I am excited to introduce AWS CodeStar.

AWS CodeStar is a cloud service designed to make it easier to develop, build, and deploy applications on AWS by simplifying the setup of your entire development project. AWS CodeStar includes project templates for common development platforms to enable provisioning of projects and resources for coding, building, testing, deploying, and running your software project.

The key benefits of the AWS CodeStar service are:

  • Easily create new projects using templates for Amazon EC2, AWS Elastic Beanstalk, or AWS Lambda using five different programming languages: JavaScript, Java, Python, Ruby, and PHP. By selecting a template, the service will provision the underlying AWS services needed for your project and application.
  • Unified experience for access and security policies management for your entire software team. Projects are automatically configured with appropriate IAM access policies to ensure a secure application environment.
  • Pre-configured project management dashboard for tracking various activities, such as code commits, build results, deployment activity and more.
  • Running sample code to help you get up and running quickly, enabling you to use your favorite IDEs, like Visual Studio, Eclipse, or any code editor that supports Git.
  • Automated configuration of a continuous delivery pipeline for each project using AWS CodeCommit, AWS CodeBuild, AWS CodePipeline, and AWS CodeDeploy.
  • Integration with Atlassian JIRA Software for issue management and tracking directly from the AWS CodeStar console

With AWS CodeStar, development teams can build an agile software development workflow that not only increases the speed at which teams can deploy software and bug fixes, but also enables developers to build software that is more in line with customers’ requests and needs.

An example of a responsive development workflow using AWS CodeStar is shown below:

Journey Into AWS CodeStar

Now that you know a little more about the AWS CodeStar service, let’s jump into using the service to set up a web application project. First, I’ll go into the AWS CodeStar console and click the Start a project button.

If you have not setup the appropriate IAM permissions, AWS CodeStar will show a dialog box requesting permission to administer AWS resources on your behalf. I will click the Yes, grant permissions button to grant AWS CodeStar the appropriate permissions to other AWS resources.

However, I received a warning that I do not have administrative permissions to AWS CodeStar as I have not applied the correct policies to my IAM user. If you want to create projects in AWS CodeStar, you must apply the AWSCodeStarFullAccess managed policy to your IAM user or have an IAM administrative user with full permissions for all AWS services.

Now that I have added the aforementioned permissions in IAM, I can now use the service to create a project. To start, I simply click on the Create a new project button and I am taken to the hub of the AWS CodeStar service.

At this point, I am presented with over twenty different AWS CodeStar project templates to choose from in order to provision various environments for my software development needs. Each project template specifies the AWS Service used to deploy the project, the supported programming language, and a description of the type of development solution implemented. AWS CodeStar currently supports the following AWS Services: Amazon EC2, AWS Lambda, and AWS Elastic Beanstalk. Using preconfigured AWS CloudFormation templates, these project templates can create software development projects like microservices, Alexa skills, web applications, and more with a simple click of a button.

For my first AWS CodeStar project, I am going to build a serverless web application with Node.js and AWS Lambda, using the Node.js/AWS Lambda project template.

You will notice that for this template, AWS CodeStar sets up all of the tools and services you need for a development project, including an AWS CodePipeline connected with the following services: AWS CodeBuild, AWS CloudFormation, and Amazon CloudWatch. I’ll name my new AWS CodeStar project TaraWebProject and click Create Project.

Since this is my first time creating an AWS CodeStar project, I will see a dialog that asks about the setup of my AWS CodeStar user settings. I’ll type Tara in the textbox for the Display Name and add my email address in the Email textbox. This information is how I’ll appear to others in the project.

The next step is to select how I want to edit my project code. I have decided to edit my TaraWebProject project code using the Visual Studio IDE. Since I’m using Visual Studio, I’ll need to configure it with the AWS Toolkit for Visual Studio 2015 to access AWS resources while editing my project code. On this screen, I am also presented with the link to the AWS CodeCommit Git repository that AWS CodeStar configured for my project.

The provisioning and tool setup for my software development project is now complete. I’m presented with the AWS CodeStar dashboard for my software project, TaraWebProject, which allows me to manage the project’s resources, such as code commits, team membership and wiki, the continuous delivery pipeline, JIRA issue tracking, project status, and other applicable project resources.
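If you prefer to explore a project from the command line, the AWS CLI also exposes CodeStar operations. A rough sketch of listing my project and the resources CodeStar provisioned might look like the following; the project ID is my guess at how CodeStar derives an ID from the project name, so treat it as a placeholder:

# List the CodeStar projects in the current region
aws codestar list-projects

# Show the project details and the AWS resources CodeStar provisioned for it
# (the project ID below is an assumed, lowercased form of the project name)
aws codestar describe-project --id tarawebproject
aws codestar list-resources --project-id tarawebproject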

What is really cool about AWS CodeStar for me is that it provides a working sample project from which I can start the development of my serverless web application. To view the sample of my new web application, I will go to the Application endpoints section of the dashboard and click the link provided.

A new browser window will open and will display the sample web application AWS CodeStar generated to help jumpstart my development. A cool feature of the sample application is that the background of the sample app changes colors based on the time of day.

Let’s now take a look at the code used to build the sample website. In order to view the code, I will go back to my TaraWebProject dashboard in the AWS CodeStar console and select the Code option from the sidebar menu.

This takes me to the tarawebproject Git repository in the AWS CodeCommit console. From here, I can manually view the code for my web application, review the commits made in the repo, compare commits or branches, and create triggers in response to my repo events.

This gives me a great starting point for developing my AWS-hosted web application. Since I opted to integrate AWS CodeStar with Visual Studio, I can update my web application by using the IDE to make code changes, and those changes will automatically be included in the TaraWebProject every time I commit to the provisioned code repository.

You will notice that on the AWS CodeStar TaraWebProject dashboard, there is a message about connecting the tools to my project repository in order to work on the code. Even though I have already selected Visual Studio as my IDE of choice, let’s click on the Connect Tools button to review the steps to connecting to this IDE.

Again, I will see a screen that allows me to choose which tool I wish to use to edit my project code: Visual Studio, Eclipse, or the command line. It is important to note that I have the option to change my IDE choice at any time while working on my development project. Additionally, I can connect to my AWS CodeCommit Git repo via HTTPS or SSH. To retrieve the appropriate repository URL for each protocol, I only need to select the Code repository URL dropdown, select HTTPS or SSH, and copy the resulting URL from the text field.
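If you would rather skip the IDE integrations entirely, the repository can also be cloned directly with Git over HTTPS using the AWS CLI credential helper. The sketch below assumes the project lives in US East (N. Virginia) and that the repository is named tarawebproject; substitute the URL you copied from the Code repository URL dropdown.

# Use the AWS CLI as a Git credential helper for CodeCommit over HTTPS
git config --global credential.helper '!aws codecommit credential-helper $@'
git config --global credential.UseHttpPath true

# Clone the project repository (replace with the URL copied from the console)
git clone https://git-codecommit.us-east-1.amazonaws.com/v1/repos/tarawebproject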

After selecting Visual Studio, CodeStar takes me to the steps needed to integrate with Visual Studio. This includes downloading the AWS Toolkit for Visual Studio, connecting Team Explorer to AWS CodeStar via AWS CodeCommit, and pushing changes to the repo.

After successfully connecting Visual Studio to my AWS CodeStar project, I return to the AWS CodeStar TaraWebProject dashboard to start managing the team members working on the web application with me. First, I will select the Setup your team tile so that I can go to the Project Team page.

On my TaraWebProject Project Team page, I’ll add a team member, Jeff, by selecting the Add team member button and clicking on the Select user dropdown. Team members must be IAM users in my account, so I’ll click on the Create new IAM user link to create an IAM account for Jeff.

When the Create IAM user dialog box comes up, I will enter an IAM user name, Display name, and Email Address for the team member, in this case, Jeff Barr. There are three project roles that Jeff can be granted: Owner, Contributor, or Viewer. For the TaraWebProject application, I will grant him the Contributor project role and allow him to have remote access by selecting the Remote access checkbox. Now I will create Jeff’s IAM user account by clicking the Create button.

This brings me to the IAM console to confirm the creation of the new IAM user. After reviewing the IAM user information and the permissions granted, I will click the Create user button to complete the creation of Jeff’s IAM user account for TaraWebProject.

After successfully creating Jeff’s account, it is important that I either send Jeff’s login credentials to him in email or download the credentials .csv file, as I will not be able to retrieve these credentials again. I would need to generate new credentials for Jeff if I leave this page without obtaining his current login credentials. Clicking the Close button returns me to the AWS CodeStar console.

Now I can finish adding Jeff as a team member to the TaraWebProject by selecting the JeffBarr-WebDev IAM user and clicking the Add button.

I’ve successfully added Jeff as a team member to my AWS CodeStar project, TaraWebProject, enabling team collaboration in building the web application.
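Team membership can also be managed programmatically. A hedged sketch of adding Jeff with the AWS CLI might look like the following; the project ID and the IAM user ARN, including the account number, are placeholders I made up for illustration:

# Grant an existing IAM user the Contributor role on the project,
# with remote access to the project's resources allowed
aws codestar associate-team-member \
  --project-id tarawebproject \
  --user-arn arn:aws:iam::111122223333:user/JeffBarr-WebDev \
  --project-role Contributor \
  --remote-access-allowed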

Another thing that I really enjoy about using the AWS CodeStar service is that I can monitor all of my project activity right from my TaraWebProject dashboard. I can see the application activity, any recent code commits, and the status of project actions, such as build results, code changes, and deployments, all from one comprehensive dashboard. AWS CodeStar ties the dashboard into Amazon CloudWatch in the Application activity section, provides data about build and deployment status from AWS CodePipeline in the Continuous Deployment section, and shows the latest Git code commit from AWS CodeCommit in the Commit history section.

Summary

In my journey through the AWS CodeStar service, I created a serverless web application project that provisioned my entire development toolchain for coding, building, testing, and deploying my TaraWebProject software using AWS services. Amazingly, I have yet to scratch the surface of the benefits of using AWS CodeStar to manage the day-to-day software development activities involved in releasing applications.

AWS CodeStar makes it easy for you to quickly develop, build, and deploy applications on AWS. AWS CodeStar provides a unified user interface, enabling you to easily manage your software development activities in one place. AWS CodeStar allows you to choose from various templates to set up projects using AWS Lambda, Amazon EC2, or AWS Elastic Beanstalk. It comes pre-configured with a project management dashboard, an automated continuous delivery pipeline, and a Git code repository using AWS CodeCommit, AWS CodeBuild, AWS CodePipeline, and AWS CodeDeploy, allowing developers to implement modern agile software development best practices. Each AWS CodeStar project gives developers a head start by providing working code samples that can be used with popular IDEs that support Git. Additionally, AWS CodeStar provides out-of-the-box integration with Atlassian JIRA Software, providing a project management and issue tracking system for your software team directly from the AWS CodeStar console.

You can get started using the AWS CodeStar service for developing new software projects on AWS today. Learn more by reviewing the AWS CodeStar product page and the AWS CodeStar user guide documentation.

Tara

Launch: Amazon Athena adds support for Querying Encrypted Data

In November of last year, we brought a service to market that we hoped would be a major step toward helping those who need to securely access and examine massive amounts of data on a daily basis. This service is none other than Amazon Athena, which I think of as a managed service that is attempting “to leap tall queries in a single bound” by querying object storage: a service that gives AWS customers the power to easily analyze and query large amounts of data stored in Amazon S3.

Amazon Athena is a serverless interactive query service that enables users to easily analyze data in Amazon S3 using standard SQL. At Athena’s core are Presto, a distributed SQL engine that runs queries with ANSI SQL support, and Apache Hive, which allows Athena to work with popular data formats like CSV, JSON, ORC, Avro, and Parquet and adds common Data Definition Language (DDL) operations such as creating, dropping, and altering tables. Athena enables performant query access to datasets stored in Amazon Simple Storage Service (S3) in both structured and unstructured data formats.

You can write Hive-compliant DDL statements and ANSI SQL statements in the Athena Query Editor from the AWS Management Console, or from SQL clients such as SQL Workbench by downloading and taking advantage of the Athena JDBC driver. Additionally, using the JDBC driver, you can run queries programmatically from your desired BI tools. You can read more about the Amazon Athena service in Jeff’s blog post from the service release in November.

After releasing the initial features of the Amazon Athena service, the Athena team kept with the Amazon tradition of focusing on the customer by working diligently to make your experience with the service better. Therefore, the team has added a feature that I am excited to announce: Amazon Athena now provides support for querying encrypted data in Amazon S3. This new feature not only makes it possible to query encrypted data in Amazon S3, but also enables the encryption of Athena’s query results. Businesses and customers who have requirements and/or regulations to encrypt sensitive data stored in Amazon S3 can now take advantage of the serverless dynamic queries Athena offers with their encrypted data.

 

Supporting Encryption

Before we dive into using the new Athena feature, let’s take some time to review the encryption options that S3 and Athena support for customers needing to secure and encrypt data. Currently, S3 supports encrypting data with AWS Key Management Service (KMS). AWS KMS is a managed service for the creation and management of encryption keys used to encrypt data. In addition, S3 supports customers using their own encryption keys to encrypt data. Since it is important to understand the encryption options that Athena supports for datasets stored on S3, in the chart below I have provided a breakdown of the encryption options supported with S3 and Athena, and noted when the new Athena table property, has_encrypted_data, is required for encrypted data access.

 

For more information on Amazon S3 encryption with AWS KMS or Amazon S3 encryption options, review the AWS KMS Developer Guide section on How Amazon Simple Storage Service (Amazon S3) Uses AWS KMS and the Amazon S3 Developer Guide section on Protecting Data Using Encryption, respectively.

 

Creating & Accessing Encrypted Databases and Tables

As I noted before, there are a couple of ways to access Athena. Of course, you can access Athena through the AWS Management Console, but you also have the option to use the JDBC driver with SQL clients like SQL Workbench and other Business Intelligence tools. In addition, the JDBC driver allows for programmatic query access.

Enough discussion; it is time to dig into this new Athena feature by creating a database and some tables, running queries against the tables, and encrypting the query results. We’ll accomplish all of this using encrypted data stored in Amazon S3.

If this is your first time logging into the service, you will see the Amazon Athena Getting Started screen as shown below. You would need to click the Get Started button to be taken to the Athena Query Editor.

Now that we are in the Athena Query Editor, let’s create a database. If the sample database is shown when you open your Query Editor, simply start typing your query statement in the Query Editor window to clear the sample query and create the new database.

I will now issue the Hive DDL command, CREATE DATABASE <dbname>, within the Query Editor window to create my database, tara_customer_db.

Once I receive the confirmation that my query execution was successful in the Results tab of Query Editor, my database should be created and available for selection in the dropdown.

I will now change my selected database in the dropdown to my newly created database, tara_customer_db.
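Since Athena can also be driven programmatically, the same database creation step can be scripted. Below is a rough sketch using the AWS CLI; the S3 output location is a placeholder bucket of my own choosing, since Athena needs somewhere to write its query results.

# Run the Hive DDL statement through the Athena API instead of the console
aws athena start-query-execution \
  --query-string "CREATE DATABASE tara_customer_db" \
  --result-configuration OutputLocation=s3://my-athena-query-results/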

 

 

With my database created, I am able to create tables from my data stored in S3. Since I did not have data encrypted with the various encryption types, the product group was kind enough to give me some sample data files to place in my S3 buckets. The first batch of sample data that I received was encrypted with SSE-KMS, which, if you recall from the encryption matrix we discussed above, is the Server-Side Encryption with AWS KMS–Managed Keys encryption type. I stored this set of encrypted data in my S3 bucket aptly named: aws-blog-tew-posts/SSE_KMS_EncryptionData. The second batch of sample data was encrypted with CSE-KMS, which is the Client-Side Encryption with AWS KMS–Managed Keys encryption type, and is stored in my aws-blog-tew-posts/CSE_KMS_EncryptionData S3 bucket. The last batch of data I received is just good old-fashioned plain text, and I have stored this data in the S3 bucket aws-blog-tew-posts/PlainText_Table.

Remember, to access my data in the S3 buckets from the Athena service, I must ensure that my data buckets have the correct permissions to allow Athena to access each bucket and the data contained therein. In addition, working with AWS KMS encrypted data requires users to have roles that include the appropriate KMS key policies. It is important to note that to successfully read KMS encrypted data, users must have the correct permissions for access to S3, Athena, and KMS collectively.

There are several ways that I can provide the appropriate access permissions between S3 and the Athena service:

  1. Allow access via user policy
  2. Allow access via bucket policy
  3. Allow access with both a bucket policy and user policy.

To learn more about Amazon Athena access permissions and/or Amazon S3 permissions, review the Athena documentation on Setting User and Amazon S3 Bucket Permissions.
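As a concrete illustration of option 1, a minimal user policy granting read access to one of my buckets might look like the sketch below. The user name is a placeholder, and your own policy may also need Athena and KMS permissions depending on the encryption type in use.

# Attach an inline policy to an IAM user allowing read access to the data bucket
aws iam put-user-policy \
  --user-name tara \
  --policy-name AthenaS3ReadAccess \
  --policy-document '{
    "Version": "2012-10-17",
    "Statement": [{
      "Effect": "Allow",
      "Action": ["s3:GetBucketLocation", "s3:GetObject", "s3:ListBucket"],
      "Resource": ["arn:aws:s3:::aws-blog-tew-posts", "arn:aws:s3:::aws-blog-tew-posts/*"]
    }]
  }'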

Since my data is ready and set up in my S3 buckets, I just need to head over to the Athena Query Editor and create my first new table from the SSE-KMS encrypted data. The DDL command that I will use to create my new table, sse_customerinfo, is as follows:

CREATE EXTERNAL TABLE sse_customerinfo( 
  c_custkey INT, 
  c_name STRING, 
  c_address STRING, 
  c_nationkey INT, 
  c_phone STRING, 
  c_acctbal DOUBLE, 
  c_mktsegment STRING, 
  c_comment STRING
  ) 
ROW FORMAT SERDE  'org.apache.hadoop.hive.ql.io.parquet.serde.ParquetHiveSerDe' 
STORED AS INPUTFORMAT  'org.apache.hadoop.hive.ql.io.parquet.MapredParquetInputFormat' 
OUTPUTFORMAT  'org.apache.hadoop.hive.ql.io.parquet.MapredParquetOutputFormat' 
LOCATION  's3://aws-blog-tew-posts/SSE_KMS_EncryptionData';

I will enter my DDL command statement for the sse_customerinfo table creation into my Athena Query Editor and click the Run Query button. The Results tab will note that the query ran successfully, and you will see my new table show up under the tables available for the tara_customer_db database.

I will repeat this process to create my cse_customerinfo table from the CSE-KMS encrypted batch of data and then the plain_customerinfo table from the unencrypted data source stored in my S3 bucket. The DDL statements used to create my cse_customerinfo table are as follows:

CREATE EXTERNAL TABLE cse_customerinfo (
  c_custkey INT, 
  c_name STRING, 
  c_address STRING, 
  c_nationkey INT, 
  c_phone STRING, 
  c_acctbal DOUBLE, 
  c_mktsegment STRING, 
  c_comment STRING
)
ROW FORMAT SERDE   'org.apache.hadoop.hive.ql.io.parquet.serde.ParquetHiveSerDe' 
STORED AS INPUTFORMAT  'org.apache.hadoop.hive.ql.io.parquet.MapredParquetInputFormat' 
OUTPUTFORMAT  'org.apache.hadoop.hive.ql.io.parquet.MapredParquetOutputFormat'
LOCATION   's3://aws-blog-tew-posts/CSE_KMS_EncryptionData'
TBLPROPERTIES ('has_encrypted_data'='true');

Again, I will enter the DDL statements above into the Athena Query Editor and click the Run Query button. If you review the DDL statements used to create the cse_customerinfo table carefully, you will notice a new table property (TBLPROPERTIES) flag, has_encrypted_data, which was introduced with the new Athena encryption capability. This flag tells Athena that the data in S3 to be used with queries for the specified table is encrypted. If you take a moment and refer back to the encryption matrix we reviewed earlier for the Athena and S3 encryption options, you will see that this flag is only required when you are using the Client-Side Encryption with AWS KMS–Managed Keys option. Once the cse_customerinfo table has been successfully created, a key symbol will appear next to the table, identifying it as an encrypted data table.

Finally, I will create the last table, plain_customerinfo, from our sample data, using the same steps we performed for the previous tables. The DDL commands for this table are:

CREATE EXTERNAL TABLE plain_customerinfo(
  c_custkey INT, 
  c_name STRING, 
  c_address STRING, 
  c_nationkey INT, 
  c_phone STRING, 
  c_acctbal DOUBLE, 
  c_mktsegment STRING, 
  c_comment STRING
)
ROW FORMAT SERDE 'org.apache.hadoop.hive.ql.io.parquet.serde.ParquetHiveSerDe' 
STORED AS INPUTFORMAT 'org.apache.hadoop.hive.ql.io.parquet.MapredParquetInputFormat' 
OUTPUTFORMAT 'org.apache.hadoop.hive.ql.io.parquet.MapredParquetOutputFormat'
LOCATION 's3://aws-blog-tew-posts/PlainText_Table';


Great! We have successfully read encrypted data from S3 with Athena, and created tables based on the encrypted data. I can now run queries against my newly created encrypted data tables.

 

Running Queries

Running queries against our new database tables is very simple. Standard SQL statements can be used to query your data stored in Amazon S3. For our query review, I am going to use Athena’s preview data feature. In the list of tables, you will see two icons beside each table. One is a table properties icon; selecting it brings up the selected table’s properties. The other, displayed as an eye symbol, is the preview data feature, which generates a simple SELECT query statement for the table.

 

 

To demonstrate running queries with Athena, I have selected to preview data for my plain_customerinfo table by selecting the eye symbol/icon next to the table. The preview data feature creates the following SQL statement:

SELECT * FROM plain_customerinfo limit 10;

The query results from using the preview data feature with my plain_customerinfo table are displayed in the Results tab of the Athena Query Editor, which provides the option to download the query results by clicking the file icon.

The new Athena encrypted data feature also supports encrypting query results and storing these results in Amazon S3. To take advantage of this feature with my query results, I will now encrypt and save my query data in a bucket of my choice. Note that the data table I have selected is currently unencrypted.
First, I’ll select the Athena Settings menu and review the current storage settings for my query results. Since I do not have a KMS key to use for encryption, I will select the Create KMS key hyperlink and create a KMS key for use in encrypting my query results with Athena and S3. For details on how to create a KMS key and configure the appropriate user permissions, please see http://docs.aws.amazon.com/kms/latest/developerguide/create-keys.html.

After successfully creating my s3encryptathena KMS key and copying the key ARN for use in my Athena settings, I return to the Athena console Settings dialog and select the Encrypt query results checkbox. I then update the Query result location textbox to point to my S3 bucket, aws-athena-encrypted, which will be the location for storing my encrypted query results.

The only thing left is to select the Encryption type and enter my KMS key. I can do this either by selecting the s3encryptathena key from the Encryption key dropdown or by entering its ARN in the KMS key ARN textbox. In this example, I have chosen to use SSE-KMS for the encryption type. You can see both examples of selecting the KMS key below. Clicking the Save button completes the process.
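The same settings can also be expressed programmatically. Below is a rough sketch of running the query with SSE-KMS encrypted results via the AWS CLI; the KMS key ARN, region, and account ID are placeholders standing in for my s3encryptathena key.

# Run a query and write SSE-KMS encrypted results to the designated bucket
aws athena start-query-execution \
  --query-string "SELECT * FROM plain_customerinfo limit 10" \
  --query-execution-context Database=tara_customer_db \
  --result-configuration 'OutputLocation=s3://aws-athena-encrypted/,EncryptionConfiguration={EncryptionOption=SSE_KMS,KmsKey=arn:aws:kms:us-east-1:111122223333:key/EXAMPLE-KEY-ID}'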

Now I will rerun my current query for my plain_customerinfo table. Remember, this table is not encrypted, but with the changes made to my Athena settings for encrypting query results, the results of queries run against this table will be stored with SSE-KMS encryption using my KMS key.

After my query rerun, I can see the fruits of my labor by going to the Amazon S3 console and viewing the CSV data files saved in my designated bucket, aws-athena-encrypted, and the SSE-KMS encryption of the bucket and files.

 

Summary

Needless to say, this Athena launch has several benefits for those needing to secure data via encryption while still retaining the ability to perform queries and analytics on data stored in varying data formats. Additionally, this release includes improvements I did not dive into in this blog post:

  • A new version of the JDBC driver that supports the new encryption feature and key updates.
  • Added the ability to add, replace, and change columns using ALTER TABLE.
  • Added support for querying LZO-compressed data.

See the release documentation in the Athena user guide for more details, and start leveraging Athena to query your encrypted data stored in Amazon S3 now by reviewing the Configuring Encryption Options section in the Athena documentation.

Learn more about Athena and serverless queries on Amazon S3 by visiting the Athena product page or reviewing the Athena User Guide. In addition, you can dig deeper on the functionality of Athena and data encryption with S3 by reviewing the AWS Big Data Blog post: Analyzing Data in S3 using Amazon Athena and the AWS KMS Developer Guide.

Happy Encrypting!

Tara

Amazon CloudWatch launches Alarms on Dashboards

Amazon CloudWatch is a service that gives customers the ability to monitor their applications, systems, and solutions running on Amazon Web Services by providing and collecting metrics, logs, and events about AWS resources in real time. CloudWatch automatically provides key resource measurements such as latency, error rates, and CPU usage, while also enabling monitoring of custom metrics via customer-supplied logs and system data.

Last November, Amazon CloudWatch added new Dashboard Widgets to provide additional data visualization options for all available metrics. In order to provide customers with even more insight into their solutions and resources running on AWS, CloudWatch has launched Alarms on Dashboards. With this alarms enhancement, customers can view alarms and metrics in the same dashboard widget enabling them to perform data-driven troubleshooting and analysis.

CloudWatch dashboards are designed with the goal of providing better visibility when monitoring AWS resources across regions in a consolidated view. Since CloudWatch dashboards are highly customizable, users can create their own custom dashboards to graphically represent data for varying metrics such as utilization, performance, estimated billing, and now alarm conditions. An alarm tracks a single metric over time based on the value of the metric in relation to a specified threshold. When the alarm state changes, an action is executed, such as applying an Auto Scaling policy or sending a notification to Amazon SNS, among other options.

With the ability to add alarms to dashboards, CloudWatch users have another mechanism to proactively monitor and receive alerts about their AWS resources and applications across multiple regions. In addition, the metric data associated with an alarm, which has been added to a dashboard, can be charted and reviewed. Alarms have three possible states:

  • OK: The value of the alarm metric does not meet the threshold
  • INSUFFICIENT DATA: The alarm has just started, or the alarm metric does not have enough data to determine whether it is in the OK state or the ALARM state
  • ALARM: The value of the alarm metric meets the threshold

When added to a dashboard, alarms are displayed in red when in the ALARM state, gray when in the INSUFFICIENT DATA state, and with no color fill when in the OK state. Alarms added to a dashboard are supported with the following widgets: Line, Number, and Stacked Graph widgets.

  • Number widget: provides a quick and efficient view of the latest value of any desired metric. Used with alarms, the widget shows the state of the alarm with different background colors behind the latest metric data.
  • Line widget: allows the visualization of the actual value of any collection of chosen metrics. With alarms, it provides a view of the alarm’s state on the dashboard, displaying the alarm threshold and condition as a horizontal line. The threshold line is a good indicator of how close a metric is to triggering the alarm.
  • Stacked graph widget: allows customers to visualize the net total effect of any collection of chosen metrics. The stacked graph widget layers one metric over another to illustrate the distribution and contribution of each metric, with the option to display the contribution of metrics as percentages. With alarms, it also provides a view of the alarm’s state, displaying the alarm threshold and condition as a horizontal line.

Currently, adding multiple metrics to the same widget for an alarm is in the works, and this feature is evolving based on customer feedback.

Adding Alarms on Dashboards

Let’s take a quick look at utilizing alarms on a CloudWatch dashboard. In the AWS Console, I will go to the CloudWatch service. Once in the CloudWatch console, I select Dashboards, click the Create dashboard button, and create the CloudWatchBlog dashboard.

 

Upon creation of my CloudWatchBlog dashboard, a dialog box will open to allow me to add widgets to the dashboard. I will forego adding widgets for now since I want to focus on adding alarms on my dashboard. Therefore, I will hit the Cancel button here and go to the Alarms section of the CloudWatch console.

Once in the Alarms section of the CloudWatch console, you will see all of your alarms and the state of each of the alarms for the current region displayed.

As we mentioned earlier, there are three types of alarm states, and as you can see in my console above, alarms in all of the different states are displayed. If desired, you can adjust the filter on the console to display alarms filtered by alarm state.

As an example, I am only interested in viewing the alarms with an alarm state of ALARM. Therefore, I will adjust the filter to show only the alarms in the current region with an alarm state as ALARM.

Now only the two alarms that have a current alarm state of ALARM are displayed. One of these alarms is for monitoring the provisioned write capacity units of an Amazon DynamoDB table, and the other is to monitor the CPU utilization of my active Amazon Elasticsearch instance.

Let’s examine the scenario in which I leverage my CloudWatchBlog dashboard as my troubleshooting mechanism for identifying and diagnosing issues with my Elasticsearch solution and its instances. I will first add the Amazon Elasticsearch CPU utilization alarm, ES Alarm, to my CloudWatchBlog dashboard. To add the alarm, I simply select the checkbox by the desired alarm, which in this case is ES Alarm. Then with the alarm selected, I click the Add to Dashboard button.

The Add to dashboard dialog box will open, allowing me to select my CloudWatchBlog dashboard. Additionally, I can select the widget type I would like to use for the display of my alarm. For the ES Alarm, I will choose the Line widget and complete the process of adding this alarm to my dashboard by clicking the Add to dashboard button.

Upon successfully adding ES Alarm to the CloudWatchBlog dashboard, you will see a confirmation notice displayed in the CloudWatch console.

If I then go to the Dashboard section of the console and select my CloudWatchBlog dashboard, I will see the line widget for my alarm, ES Alarm, on the dashboard. To ensure that my ES Alarm widget is a permanent part of the dashboard, I will click the Save dashboard button to preserve the addition of this widget on the dashboard.
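Dashboards, including alarm widgets, can also be managed through the CloudWatch API. Below is a hedged sketch of what adding the ES Alarm as a line widget with the AWS CLI might look like; the alarm ARN and account ID are placeholders, and put-dashboard replaces any existing dashboard body of the same name.

# Create (or replace) the CloudWatchBlog dashboard with a single alarm widget
aws cloudwatch put-dashboard \
  --dashboard-name CloudWatchBlog \
  --dashboard-body '{
    "widgets": [{
      "type": "metric",
      "x": 0, "y": 0, "width": 12, "height": 6,
      "properties": {
        "title": "ES Alarm",
        "annotations": { "alarms": ["arn:aws:cloudwatch:us-east-2:111122223333:alarm:ES Alarm"] },
        "view": "timeSeries",
        "stacked": false
      }
    }]
  }'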

As we discussed, one of the benefits of utilizing a CloudWatch dashboard is the ability to add several alarms from various regions onto a dashboard. Since my scenario is leveraging my dashboard as a troubleshooting mechanism for my Elasticsearch solution, I would like to have several alarms and metrics related to my solution displayed on the CloudWatchBlog dashboard. Given this, I will create another alarm for my Elasticsearch instance and add it to my dashboard.

I will first return to the Alarms section of the console and click the Create Alarm button.

The Create Alarm dialog box is displayed showing all of the current metrics available in this region. From the summary, I can quickly see that there are 21 metrics being tracked for Elasticsearch. I will click on the ES Metrics link to view the individual metrics that can be used to create my alarm.

I can review the individual metrics shown for my Elasticsearch instance, and choose which metric I want to base my new alarm on. In this case, I choose the WriteLatency metric by selecting the checkbox for this metric and then click the Next button.

 

The next screen is where I fill in all the details about my alarm: name, description, alarm threshold, time period, and alarm action. I will name my new alarm, ES Latency Alarm, and complete the rest of the aforementioned data fields. To complete the creation of my new alarm, I click the Create Alarm button.

I will see a confirmation message box at the top of the Alarms console upon successful completion of adding the alarm, and the status of the newly created alarm will be displayed in the alarms list.
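The same alarm can be created from the command line with put-metric-alarm. The sketch below only approximates what I configured in the console: the threshold, period, and dimension values (my account ID and the taratestdomain domain) are illustrative placeholders.

# Create a latency alarm on the Elasticsearch domain's WriteLatency metric
aws cloudwatch put-metric-alarm \
  --alarm-name "ES Latency Alarm" \
  --namespace AWS/ES --metric-name WriteLatency \
  --dimensions Name=DomainName,Value=taratestdomain Name=ClientId,Value=111122223333 \
  --statistic Average --period 300 --evaluation-periods 1 \
  --threshold 100 --comparison-operator GreaterThanThreshold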

Now I will add my ES Latency Alarm to my CloudWatchBlog dashboard. Again, I click on the checkbox by the alarm and then click the Add to Dashboard button.

This time when the Add to Dashboard dialog comes up, I will choose the Stacked area widget to display the ES Latency Alarm on my CloudWatchBlog dashboard. Clicking the Add to Dashboard button will complete the addition of my ES Latency Alarm widget to the dashboard.

Once back in the console, I will again see the confirmation noting the successful addition of the widget. I go to Dashboards, click on the CloudWatchBlog dashboard, and can now view the two widgets in my dashboard. To include this widget in the dashboard permanently, I click the Save dashboard button.

The final thing to note about the new CloudWatch feature, Alarms on Dashboards, is that alarms and metrics from other regions can be added to the dashboard for a complete view for troubleshooting. Let’s add a metric to the dashboard with the alarms widget.

Within the console, I will move from my current region, US East (Ohio), to the US East (N. Virginia) region.

Now I will go to the Metric section of the CloudWatch console. This section displays the metrics from services used in the US East (N. Virginia) region.

My Elasticsearch solution triggers Lambda functions to capture all of the EmployeeInfo DynamoDB database CRUD (Create, Read, Update, Delete) changes via DynamoDB streams and write those changes into my Elasticsearch domain, taratestdomain. Therefore, I will add metrics to my CloudWatchBlog dashboard to track table metrics from DynamoDB.

Specifically, I am going to add the EmployeeInfo database ProvisionedWriteCapacityUnits metric to my CloudWatchBlog dashboard.

Back again in the Add to Dashboard dialog, I will select my CloudWatchBlog dashboard and choose to display this metric using the Number widget.

Now the ProvisionedWriteCapacityUnits metric from US East (N. Virginia) is displayed in the CloudWatchBlog dashboard in a Number widget, alongside the alarms from US East (Ohio). To make this update permanent in the dashboard, I will (you guessed it!) click the Save dashboard button.

Summary

Getting started with alarms on dashboards is easy. You can use alarms on dashboards across regions as another means of proactively monitoring your resources, building troubleshooting playbooks, and viewing desired metrics. You can also choose the metric first in the Metrics UI and then change the widget type according to the visualization that best fits the metric.

Alarms on Dashboards are supported on Line, Stacked Area, and Number widgets. In addition, you can use Text widgets next to alarms on a dashboard to add steps or observations on how to handle changes in the alarm state. To learn more about Amazon CloudWatch widgets and about the additional dashboard widgets, visit the Amazon CloudWatch documentation and the CloudWatch Getting Started guide.

 

Tara

New – Amazon EMR Instance Fleets

Today we’re excited to introduce a new feature for Amazon EMR clusters called instance fleets. Instance fleets give you a wider variety of options and intelligence around instance provisioning. You can now provide a list of up to 5 instance types with corresponding weighted capacities and Spot bid prices (including Spot blocks)! EMR will automatically provision On-Demand and Spot capacity across these instance types when creating your cluster. This can make it easier and more cost effective to quickly obtain and maintain your desired capacity for your clusters.

You can also specify a list of Availability Zones and EMR will optimally launch your cluster in one of the AZs. EMR will also continue to rebalance your cluster in the case of Spot instance interruptions by replacing instances with any of the available types in your fleet. This makes it easier to maintain your cluster’s overall capacity. Instance fleets can be used instead of instance groups. Just like groups, your cluster will have master, core, and task fleets.

Let’s take a look at the console updates to get an idea of how these fleets work.

We’ll start by navigating to the EMR console and clicking the Create Cluster button. That should bring us to our familiar EMR provisioning console where we can navigate to the advanced options near the top left.

We’ll select the latest EMR version (instance fleets are available for EMR versions 4.8.0 and greater, with the exception of 5.0.x) and click next.

Now we get to the good stuff! We’ll select the new instance fleet option in the hardware options.

Now what I want to do is modify our core group to have a couple of instance types that will satisfy the needs of my cluster.


EMR will provision capacity in each instance fleet and Availability Zone to meet my requirements in the most cost-effective way possible. The EMR console provides an easy mapping of vCPU to weighted capacity for each instance type, making it easy to use vCPU as the capacity unit (I want 16 total vCPUs in my core fleet). If the vCPU units don’t match my criteria for weighting instance types, I can change the “Target capacity” selector to include arbitrary units and define my own weights (this is how the API/CLI consume capacity units as well).

When the cluster is being provisioned, if EMR is unable to obtain the desired Spot capacity within a user-defined timeout, you can have it either terminate the cluster or fall back to On-Demand instances to provision the rest of the capacity.

All this functionality for instance fleets is also available from the AWS SDKs and the CLI. Let’s take a look at how we would provision our own instance fleet.

First we’ll create our configuration json in my-fleet-config.json:

[
  {
    "Name": "MasterFleet",
    "InstanceFleetType": "MASTER",
    "TargetOnDemandCapacity": 1,
    "InstanceTypeConfigs": [{"InstanceType": "m3.xlarge"}]
  },
  {
    "Name": "CoreFleet",
    "InstanceFleetType": "CORE",
    "TargetSpotCapacity": 11,
    "TargetOnDemandCapacity": 11,
    "LaunchSpecifications": {
      "SpotSpecification": {
        "TimeoutDurationMinutes": 20,
        "TimeoutAction": "SWITCH_TO_ON_DEMAND"
      }
    },
    "InstanceTypeConfigs": [
      {
        "InstanceType": "r4.xlarge",
        "BidPriceAsPercentageOfOnDemandPrice": 50,
        "WeightedCapacity": 1
      },
      {
        "InstanceType": "r4.2xlarge",
        "BidPriceAsPercentageOfOnDemandPrice": 50,
        "WeightedCapacity": 2
      },
      {
        "InstanceType": "r4.4xlarge",
        "BidPriceAsPercentageOfOnDemandPrice": 50,
        "WeightedCapacity": 4
      }
    ]
  }
]

Now that we have our configuration we can use the AWS CLI’s ’emr’ subcommand to create a new cluster with that configuration:

aws emr create-cluster --release-label emr-5.4.0 \
--applications Name=Spark Name=Hive Name=Zeppelin \
--service-role EMR_DefaultRole \
--ec2-attributes InstanceProfile="EMR_EC2_DefaultRole,SubnetIds=[subnet-1143da3c,subnet-2e27c012]" \
--instance-fleets file://my-fleet-config.json
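Once the cluster is provisioned, we can inspect how EMR filled each fleet with a quick list call. This is just a sketch; the cluster ID below is a placeholder for the ID returned by create-cluster.

# Show the instance fleets, their target capacities, and what was provisioned
aws emr list-instance-fleets --cluster-id j-XXXXXXXXXXXXX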

If you’re eager to get started, the feature is available now at no additional cost, and you can find detailed documentation to help you get started here.

Thanks to the EMR service team for their help writing this post!

Randall Hunt

Launch: Amazon ElastiCache Launches Enhanced Redis Backup and Restore with Cluster Resizing

Most of us equate in-memory caching with improved performance and lower cost at scale when designing applications or building solutions. Now, if only there were a service that would continually make it simpler to deploy and utilize an in-memory cache in the cloud while increasing the ability to scale.

Okay, no more joking around: the cloud service that provides this great functionality is, of course, Amazon ElastiCache. Amazon ElastiCache is an AWS managed service that provides a performant in-memory data store or cache in the cloud, while offering a straightforward way to create, scale, and manage a distributed environment for low-latency, secure access to I/O-intensive or compute-heavy data. Additionally, ElastiCache reduces the overhead of managing infrastructure for your in-memory data structure server or cache by detecting and replacing failed nodes, while providing enhanced visibility into key performance metrics of the caching system’s nodes via Amazon CloudWatch. This exciting service is now launching support for Enhanced Redis Backup and Restore with Cluster Resizing.

For those of you familiar with Amazon ElastiCache, you are likely aware that ElastiCache currently supports two in-memory key-value engines:

  • Memcached: an open source, high-performance, distributed memory object caching system developed in 2003 with the initial goal of speeding up dynamic web applications by alleviating database load
  • Redis: an open source in-memory data structure store launched in 2009, used as a database, cache, and message broker, with built-in replication, atomic operation support, various levels of on-disk persistence, and high availability via Redis Cluster.

In October of 2016, support was added for Redis Cluster with Redis 3.2.4. This allowed ElastiCache for Redis users not only to take advantage of Redis Cluster, but also to:

  • Create cluster-level backups.
  • Produce snapshots of each of the cluster’s shards contained within backups.
  • Scale their workloads to up to 3.5 TiB of data across as many as 15 shards.

You can read more about using Redis with ElastiCache and the related features by reviewing the product page for Amazon ElastiCache for Redis.

With the launch of the Enhanced Backup and Restore with Cluster Resizing feature, ElastiCache is providing even deeper support for Redis with a clear-cut migration path to a managed Redis Cluster experience. There are several benefits of this enhancement for ElastiCache and Redis users alike, such as:

  • The ability to restore a backup into a Redis Cluster with a different number of shards and slot distribution
  • The capability for users to resize Redis workloads
  • Support for Redis database file (RDB) snapshots as input for creating a sharded Redis Cluster
  • The option to use snapshots of Redis on EC2 implementations (both Redis Cluster and single-shard Redis) as data input for sharded Redis Cluster creation

To accomplish these tasks, ElastiCache will parse the Redis key space across the backup’s individual snapshots, and redistribute the keys in the new Cluster according to the requested number of shards and hash slots. You would simply take your RDB snapshots and store them on S3, then provide ElastiCache with the desired number of shards and the snapshot file. ElastiCache handles the heavy lifting of restoring the Redis data store into a Redis cluster.
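For those who prefer to script the restore, a rough sketch of the equivalent AWS CLI call is below. The node type and the shard and replica counts are just illustrative choices matching the walkthrough that follows, and the snapshot ARN simply points at the RDB file in S3.

# Create a Redis (cluster mode enabled) replication group seeded from an RDB file in S3
aws elasticache create-replication-group \
  --replication-group-id tew-rediscluster \
  --replication-group-description "Redis cluster restored from an external RDB snapshot" \
  --engine redis --engine-version 3.2.4 \
  --cache-node-type cache.r3.large \
  --cache-parameter-group-name default.redis3.2.cluster.on \
  --num-node-groups 3 --replicas-per-node-group 2 \
  --port 6379 \
  --snapshot-arns arn:aws:s3:::aws-blog-tew-posts/dump_1.rdb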

I am sure that you all may be thinking: is it really that easy to leverage the Enhanced Redis Backup and Restore with Cluster Resizing feature in ElastiCache? Well, there is no time like the present to find out. Let’s take a trip to the AWS Management Console and put this newly launched enhancement into action by restoring an external RDB snapshot to a new cluster using ElastiCache.

My first stop in the AWS Management Console is the Amazon S3 console. I have some Redis .rdb snapshot files I received from some of my peers here at AWS in order to test the restore of an external Redis snapshot to ElastiCache. I will need to put these files into Amazon S3 so that I can access the snapshots as input for my ElastiCache Redis cluster.

In the S3 console, I will go to my S3 bucket, aws-blog-tew-posts, that I created for testing and development purposes. I’ll upload the .rdb snapshot files that were provided to me into this S3 bucket.

 

It is important to note that the name of your S3 bucket must conform to DNS standards. To be DNS-compliant, the name must be at least three characters, must contain only lowercase letters, numbers, and/or dashes, and it must start and end with a lowercase letter or number. While this may be obvious, I will also note that the bucket name cannot be in an IP address format. You can learn more about the S3 Bucket Restrictions at the link provided here.

With my .rdb files successfully uploaded into my aws-blog-tew-posts bucket, I need to take note of the S3 path to these backup files. For these files, the path would be aws-blog-tew-posts/dump_1.rdb or aws-blog-tew-posts/dump_10.rdb. If you have placed your files into a folder, the folder name would need to be included in this path, i.e. thebucketname/thefoldername/thefilename.

For ElastiCache to access these files, I need to ensure that the service has read permissions for each of the files. To provide access, I will update the permissions for each of the .rdb files by assigning the Grantee as the canonical ID for my region and granting the user Open/Download permissions. The canonical ID for all regions, outside of China (Beijing) and AWS GovCloud (US), is as follows:

540804c33a284a299d2547575ce1010f2312ef3da9b3a053c8bc45bf233e4353

After I click the Save button, I am all set to use these files as input for an ElastiCache Redis cluster.
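If you would rather grant that read permission from the command line, a sketch using the s3api commands is below. Keep in mind that put-object-acl replaces the object’s existing ACL, so you may also want to re-grant yourself full control; the example only shows the read grant for the canonical ID above.

# Grant the regional ElastiCache canonical user read access to the snapshot object
aws s3api put-object-acl \
  --bucket aws-blog-tew-posts \
  --key dump_1.rdb \
  --grant-read id=540804c33a284a299d2547575ce1010f2312ef3da9b3a053c8bc45bf233e4353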

The next step is to go to the ElastiCache console. Here I will create a new ElastiCache Redis cluster and seed this new cluster with data from one of the RDB snapshots located in my S3 bucket. I’ll choose the dump_1.rdb snapshot file to use as the data input to seed this new cluster. Since I want to explore the ElastiCache Redis capabilities added this past October with 3.2.4 support for Redis Cluster, as well as discuss the new Backup and Restore with Cluster Resizing enhancements, I’ll create a new Redis Cluster and ensure I have cluster mode enabled. At this point, I should note that you cannot restore from a backup created using a Redis (cluster mode enabled) cluster to a Redis (cluster mode disabled) cluster.

First, I will click the Get Started Now button from the ElastiCache console dashboard, or the Create button, depending on your console view.

In the Create your Amazon ElastiCache cluster dialog window, I’ll select Redis for my caching engine and make sure I click the checkbox for Cluster Mode enabled (Scale Out). The name of my new cluster will be tew-rediscluster, and since I am enabling cluster mode, my ElastiCache Redis engine version is 3.2.4. For this cluster, I will keep the default Redis port of 6379.

The key benefit of the ElastiCache enhanced Redis Backup and Restore feature is the cluster resizing capability, which allows me to build a new cluster with a different number of shards than was originally used for the backup file. To build the new Redis Cluster, I am using only one RDB snapshot file, dump_1.rdb, which is a small Redis instance backup with only one shard. However, in creating my new tew-rediscluster, I have opted for 3 shards with 2 replicas per shard.

In addition, I have the ability to specify a node type for my new cluster that is a different size than my original instance from the RDB snapshot. As I mentioned, the dump_1.rdb is a backup of a Redis instance that is significantly smaller than the size of the chosen node type for my tew-rediscluster shown below.

There are other options and data inputs needed to complete the creation of my ElastiCache Redis cluster that I will not show in this blog post. However, if you want to go through each of the steps necessary for creating an ElastiCache Redis cluster, you can find more information in the AWS ElastiCache Getting Started documentation for Launch a Cluster.

Once I have provided all the information needed to create my ElastiCache Redis cluster, I need to tell ElastiCache how to seed the cluster with the .rdb file by providing the file location in my S3 bucket. In the Import Data to Cluster section of the create dialog, I will enter the S3 path to my dump_1.rdb file in the Seed RDB file S3 location textbox. Remember, the nomenclature for the S3 file path is Bucket/Folder/ObjectName, so I will enter aws-blog-tew-posts/dump_1.rdb as the path to the RDB file in S3. All that is left now is to click the Create button.

 

That’s it! ElastiCache goes to work creating the new Redis cluster. After a short time, the ElastiCache console shows my new Amazon ElastiCache Redis cluster as available, and I have successfully created this cluster with data restored from an external RDB snapshot file.

 

I just demonstrated how you can create an ElastiCache Redis cluster using an external RDB snapshot, but of course, you can create backups and restore from backups of your existing ElastiCache Redis clusters as well. To dig deeper into this newly launched feature, visit Restoring From a Backup with Cluster Resizing in the Amazon ElastiCache User Guide.

To learn more about making your applications more performant with Amazon ElastiCache, visit the AWS Amazon ElastiCache page for product details, resources, and customer testimonials.

– Tara