Category: AWS Lambda


New – CloudWatch Events for EBS Snapshots

by Jeff Barr | on | in Amazon CloudWatch, Amazon Elastic Block Store, AWS Lambda, Launch | | Comments

Cloud computing can improve upon traditional IT operations by giving you the power to automate complex high-level operations that were formerly kept in a runbook or passed along as tribal knowledge. Far too many of these operations involve backup and recovery operations, especially in smaller and less mature organizations.

Many AWS customers make great use of Amazon Elastic Block Store (EBS) volumes, especially given the ease with which they can generate and manage snapshot backups. They are also copying snapshots between regions on a regular basis for disaster recovery and other operational reasons.

Today we are bringing the benefits of automation to EBS with the addition of new CloudWatch Events for EBS snapshots. You can use these events to add additional automation to your cloud-based backup environment. Here are the new events:

  • createSnapshot – Fired after the status of a newly created EBS snapshot changes to Complete.
  • copySnapshot – Fired after the status of a snapshot copy changes to Complete.
  • shareSnapshot – Fired after a snapshot is shared with your AWS account.

A lot of AWS customers monitor the status of their snapshots by making repeated calls to the DescribeSnapshots function and then stepping through the paginated output in order to locate a specific snapshot. These new events open the door to all sorts of event-driven automation, including the cross-region copy that I mentioned earlier.

Using Snapshot Events
In order to get a better understanding of how this feature helps to automate data backup workflows, I’ll create a workflow that copies a completed snapshot to another region. First, I’ll create an IAM policy that grants appropriate permissions. Then I will incorporate an AWS Lambda function (created by my colleagues) that takes action on the createSnapshot event. Finally, I’ll create a CloudWatch Events rule to capture the event and route it to the Lambda function.

I start out by creating an IAM role (CopySnapshotToRegion) with this policy:

Then I created a new Lambda function (you can find the code at Amazon CloudWatch Events for EBS):

Next, I hopped over to the CloudWatch Events Console, clicked on Create rule, and set it up to handle successful createSnapshot events:

And gave it a name:

To test it out, I create a new EBS snapshot in my source region:

The function was invoked as expected and the snapshot was copied to the target region within seconds (in practice, the copy time will depend on the size of the snapshot):

You can also use these events to make copies of snapshots that are shared with you from other accounts. Many AWS customers partition their usage across multiple accounts for various organizational and security reasons; take a look at our AWS Multiple Account Security Strategy to see our in-depth recommendations in this area. Here are two of the five models included therein:

Available Now
The new events are available  in the US East (Northern Virginia), US East (Ohio), US West (Northern California), US West (Oregon), Asia Pacific (Mumbai), Asia Pacific (Seoul), Asia Pacific (Singapore), Asia Pacific (Sydney), Asia Pacific (Tokyo), EU (Frankfurt), EU (Ireland), and South America (São Paulo) Regions and you can start using them today! Take a look and let me know what you come up with.

Jeff;

 

PS – If you are a developer, development manager, or a product manager and would like to build systems like this, check out the EBS Jobs page.

 

Genome Engineering Applications: Early Adopters of the Cloud

by Jeff Barr | on | in Amazon DynamoDB, Amazon S3, AWS Lambda, Customer Success | | Comments

Our friends at the Commonwealth Scientific and Industrial Research Organization (CSIRO) in Australia sent along the guest post below to tell us about how AWS powers an important new genome editing technique.

— Jeff


 

Recent developments in molecular engineering technology now enables the accurate editing of genomes. The new technology, called CRISPR-Cas9, can be programmed to recognize and edit specific locations in the genome by pattern-matching unique sequences of DNA. While this is a powerful new tool for researchers, the ability to scan and identify targets across the entire genome has created unprecedented demand for large-scale computation. Earlier this year, the US National Institutes of Health (NIH) has approved the use of these technologies for human health. This has the potential to revolutionize cancer treatments and also adds a new time-critical dimension to the compute requirements.

A New Approach to Cancer Treatments
Approximately two in five people will be diagnosed with cancer at some point during their lifetime and while overall cancer survival has doubled, there are still cancer types with very low survival rate, for example just 1% for pancreatic cancer. This is mainly due to the difficulty of finding therapeutic interventions that kill cancer cells but not harm the healthy tissue in the body.

The new NIH approved trial will leverage breakthroughs in the genome editing technology, CRISPR-Cas9, to develop a different treatment approach. In this, the patient’s own immune system is boosted through specific modifications of the cells that natively fight cancer. This has the potential of being effective for a wide range of different tumors, with the current trial including patients with specific blood and solid cancers, as well as melanoma.

Cloud Services for Computationally Guided Genome Engineering
This new application in human health requires an increase in robustness and efficiency of CRISPR-Cas9 design in order to meet the time constraints of clinical care. Built on AWS cloud-services, researchers in the eHealth program of the Commonwealth Scientific and Industrial Research Organization (CSIRO) in Australia, developed GT-Scan2, a novel software tool to address this issue.

“Compared to other available methods, GT-Scan2 identifies genomic location with higher sensitivity and specificity,” says Dr. Denis Bauer who is leading the transformational bioinformatics team.

GT-Scan2 shows the identified CRISPR target sites at the genomic position and annotates them with high or low activity as well as their off-target potential.

GT-Scan2 improves the effectiveness of the system by finding sites that are unique in the genome. This avoids diluting the effect due to “off-target”, which are other sites in the genome with high sequence similarity. It also optimizes robustness by finding sites that are easier to modify.

“While it was known that the three-dimensional genome organization plays a role in CRISPR binding, GT-Scan2 is the first tool to also leverage other components that are crucial for Cas9 activity,” says Dr. Laurence Wilson whose research focuses on computational genome engineering.

Specifically the off-target search is a compute intensive task traditionally reserved for researchers at large institutes with high-performance-compute infrastructure as every location in the 3 billion letter long genomic sequence needs to be investigated. GT-Scan2 democratizes the ability to find optimal sites by offering this complex computation as a cloud-service using AWS Lambda functions.

Scaling Instantaneously for Personalized Treatments
GT-Scan2 leverages the instantaneous scalability that the event-driven AWS Lambda service offers. This is crucial for personalized treatment, as complexity of the targeted gene can vary dramatically.

“The off-target search as well as the robustness analysis can be subdivided into independent, modular tasks that can run in parallel” says Aidan O’Brien who designed and implemented the system within weeks after its official Asia-Pacific launch in April this year at the AWS Summit 2016 attesting to the intuitive nature of the service. A typical job takes less than a minute and the variation between jobs range from 1 second to 5 minutes. This fast fluctuation in load over minutes rather than hours ruled out an EC2-based solution as new instances would come online too slowly to keep the runtime stable.

GT-Scan2 is served directly from S3 making it a static web app without server-side processing. It retrieves the dynamic content (such as job results and parameters) via API calls using API Gateway from a database (DynamoDB) using a JavaScript framework.

When a user submits a job, GT-Scan2 inserts the job parameters as an item into a DynamoDB table via an API call. This allows the solution to be freely scalable without creating a bottleneck. The database entry triggers the first Lambda function, which finds all putative CRISPR targets in the user-specified DNA sequence (fetched automatically upon user submission). Potential CRISPR target sites have fixed rules and can be easily found using a regular expression that completes in seconds and are inserted into a second DynamoDB table.

Adapting to leverage the power of Lambda-based microservices

All potential targets need to be evaluated for their off-target risk using the efficient string matching tool, Bowtie. Though Bowtie only requires a reduced representation of the 3 billion letter genomic sequence, the sizes of these index files exceed the storage limitation for each Lambda instance. “GT-Scan2 divides the genome into smaller blocks to fit the Lambda specifications” explains Adrian White (Research & Technical Computing, APAC) who supported the CSIRO team during development. For an average run, GT-Scan2 hence triggers 500-1000 individual Lambda functions, which simultaneously update the scores for the different putative targets in DynamoDB. During this process, the frontend is polling this table via API Gateway and updating the webpage as results come in, eliminating the need for server-side compute.

“AWS’s Lambda has given us a great framework to develop a future-ready software package able to support medical genome engineering applications,” says Dr. Bauer. “We are specifically impressed with the ability to instantaneously scale at run time by spawning more Lambda functions to cope with the varying complexity of the different genes.” Other benefits Dr. Bauer quotes include only paying for storage during periods of no use and jobs not competing with web server resources as the website is a static page with dynamic content updated through Angular 2 and the API Gateway, as well as not needing to maintain compute instances (security patches of OS).

“One of the best things about Lambda is that users will be able to easily swap-in different machine learning algorithms that are better suited for specific CRISPR applications” says Dr. Wilson.

The GT-Scan2 Team, from left, Denis Bauer, Laurence Wilson, Aidan O’Brien

“The computational genome engineering community is one of the early adopters of our AWS Lambda technology,” explains Dr. Mia Champion (Technical Business Development Manager, Scientific Computing). “GT-Scan2’s use of API Gateway and DynamoDB is a very neat solution to ensure scalability and their clever use of epigenomics really sets them apart from other recent applications using lambda to perform CRISPR searches. I am looking forward to seeing GT-Scan2 adopted in medical applications.”

Congratulations to the Winners of the Serverless Chatbot Competition!

by Jeff Barr | on | in Amazon API Gateway, AWS Lambda, Developers |

I announced the AWS Serverless Chatbot Competion in August and invited you to build a chatbot for Slack using AWS Lambda and Amazon API Gateway.

Last week I sat down with fellow judges Tim Wagner (General Manager of AWS Lambda) and Cecilia Deng (a Software Development Engineer on Tim’s team) to watch the videos and to evaluate all 62 submissions. We were impressed by the functionality and diversity of the entrees, as well as the efforts that the entrants put in to producing attractive videos to show their submissions in action.

After hours of intense deliberation we chose a total of 9 winners: 8 from individuals, teams & small organizations and one from a larger organization. Without further ado, here you go:

Individuals, Teams, and Small Organizations
Here are the winners of the Serverless Slackbot Hero Award. Each winner receives one ticket to AWS re:Invent, access to discounted hotel room rates, public announcement and promotion during the Serverless Computing keynote, some cool swag, and $100 in AWS Credits. You can find the code for many of these bots on GitHub. In alphabetical order, the winners are:

AWS Network Helper“The goal of this project is to provide an AWS network troubleshooting script that runs on a serverless architecture, and can be interacted with via Slack as a chat bot.GitHub repo.

B0pb0t – “Making Mealtime Awesome.” GitHub repo.

Borges – “Borges is a real-time translator for multilingual Slack teams.” GitHub repo.

CLIve – “CLIve makes managing your AWS EC2 instances a doddle. He understands natural language, so no need to learn a new CLI!”

Litlbot – “Litlbot is a Slack bot that enables realtime interaction with students in class, creating a more engaged classroom and learning experience.” GitHub repo.

Marbot – “Forward alerts from Amazon Web Services to your DevOps team.”

Opsidian – “Collaborate on your AWS infra from Slack using natural language.”

ServiceBot – “Communication platform between humans, machines, and enterprises.” GitHub repo.

Larger Organization
And here’s the winner of the Serverless Slackbot Large Organization Award:

Eva – “The virtual travel assistant for your team.” GitHub repo.

Thanks & Congratulations
I would like to personally thank each of the entrants for taking the time to submit their entries to the competition!

Congratulations to all of the winners; I hope to see you all at AWS re:Invent.

Jeff;

 

PS – If this list has given you an idea for a chatbot of your very own, please watch our Building Serverless Chatbots video and take advantage of our Serverless Chatbot Sample.

Amazon Aurora Update – Call Lambda Functions From Stored Procedures; Load Data From S3

by Jeff Barr | on | in Amazon Aurora, Amazon S3, AWS Lambda |

Many AWS services work just fine by themselves, but even better together! This important aspect of our model allows you to select a single service, learn about it, get some experience with it, and then extend your span to other related services over time. On the other hand, opportunities to make the services work together are ever-present, and we have a number of them on our customer-driven roadmap.

Today I would like to tell you about two new features for Amazon Aurora, our MySQL-compatible relational database:

Lambda Function Invocation – The stored procedures that you create within your Amazon Aurora databases can now invoke AWS Lambda functions.

Load Data From S3 – You can now import data stored in an Amazon Simple Storage Service (S3) bucket into a table in an Amazon Aurora database.

Because both of these features involve Amazon Aurora and another AWS service, you must grant Amazon Aurora permission to access the service by creating an IAM Policy and an IAM Role, and then attaching the Role to your Amazon Aurora database cluster. To learn how to do this, see Authorizing Amazon Aurora to Access Other AWS Services On Your Behalf.

Lambda Function Integration
Relational databases use a combination of triggers and stored procedures to enable the implementation of higher-level functionality. The triggers are activated before or after some operations of interest are performed on a particular database table. For example, because Amazon Aurora is compatible with MySQL, it supports triggers on the INSERT, UPDATE, and DELETE operations. Stored procedures are scripts that can be run in response to the activation of a trigger.

You can now write stored procedures that invoke Lambda functions. This new extensibility mechanism allows you to wire your Aurora-based database to other AWS services. You can send email using Amazon Simple Email Service (SES), issue a notification using Amazon Simple Notification Service (SNS), insert publish metrics to Amazon CloudWatch, update a Amazon DynamoDB table, and more.

At the appliction level, you can implement complex ETL jobs and workflows, track and audit actions on database tables, and perform advanced performance monitoring and analysis.

Your stored procedure must call the mysql_lambda_async procedure. This procedure, as the name implies, invokes your desired Lambda function asynchronously, and does not wait for it to complete before proceeding. As usual, you will need to give your Lambda function permission to access any desired AWS services or resources.

To learn more, read Invoking a Lambda Function from an Amazon Aurora DB Cluster.

Load Data From S3
As another form of integration, data stored in an S3 bucket can now be imported directly in to Aurora (up until now you would have had to copy the data to an EC2 instance and import it from there).

The data can be located in any AWS region that is accessible from your Amazon Aurora cluster and can be in text or XML form.

To import data in text form, use the new LOAD DATA FROM S3 command. This command accepts many of the same options as MySQL’s LOAD DATA INFILE, but does not support compressed data. You can specify the line and field delimiters and the character set, and you can ignore any desired number of lines or rows at the start of the data.

To import data in XML form,  use the new LOAD XML from S3 command. Your XML can look like this:

<row column1="value1" column2="value2" />
...
<row column1="value1" column2="value2" />

Or like this:

<row>
  <column1>value1</column1>
  <column2>value2</column2>
</row>
...

Or like this:

<row>
  <field name="column1">value1</field>
  <field name="column2">value2</field>
</row>
...

To learn more, read Loading Data Into a DB Cluster From Text Files in an Amazon S3 Bucket.

Available Now
These new features are available now and you can start using them today!

There is no charge for either feature; you’ll pay the usual charges for the use of Amazon Aurora, Lambda, and S3.

Jeff;

 

Running Express Applications on AWS Lambda and Amazon API Gateway

by Jeff Barr | on | in Amazon API Gateway, AWS Lambda |

Express is a web framework for Node.js . It simplifies the development of “serverless” web sites, web applications, and APIs. In a serverless environment, most or all of the backend logic is run on-demand in a stateless fashion (see Mike Roberts’ article on Serverless Architectures for a more detailed introduction). AWS Lambda, when used in conjunction with the new Amazon API Gateway features that I blogged about earlier this month (API Gateway Update – New Features Simplify API Development), allows existing Express applications to be run in serverless fashion. When you use API Gateway you have the opportunity to take advantage of additional features such as Usage Plans which allow you to build a developer ecosystem around your APIs, and caching which allows you to build applications that are responsive and cost-effective.

In order to help you to migrate your Express applications to Lambda and API Gateway, we have created the aws-serverless-express package. This package contains a working example that you can use as a starting point for your own work.

I have two resources that you can use to migrate your Express code and applications to API Gateway and Lambda:

Jeff;

Yemeksepeti: Our Shift to Serverless Architecture

by Jeff Barr | on | in AWS Lambda, Customer Success, Guest Post |

AWS Community Hero Onur Salk wrote the guest post below in order to tell you how he helped his employer to move to a serverless architecture.

Jeff;


I’m Onur Salk, AWS Community Hero, AWS Certified Solutions Architect – Professional, and organizer of the AWS user group in Turkey. As a Hero, I like to share my AWS experience and knowledge with the community on my personal blog and through meetups with the community. Today I want to share the story behind Yemeksepeti and our shift to serverless architecture.

The Story Behind Yemeksepeti
Yemeksepeti is the biggest online food ordering company in Turkey. It lets users place food orders from affiliated network restaurants without charging any extra fees. At Yemeksepeti, we needed to set up a globally distributed service that is scalable, high-performing, and cost-effective. Our belief is that by designing a serverless architecture, we won’t have to worry about managing our servers and can remove a lot of operational burdens from our team. This means we can focus on running our code at scale.

At Yemeksepeti.com, we developed a real-time discount system called Joker about four years ago. The purpose of this system is to suggest discounts to customers that they normally cannot find for restaurants. The original Joker platform was developed in .NET and then integrated with the website and mobile devices using its REST API. We were asked to open the platform’s API to our sister companies operating in 34 countries, so that they can also provide real-time Joker discounts to their customers.

Initially, we thought we would share our code and let them integrate their applications. However, most of other countries were using different technology stack (programming languages, database, and so on). Although using our code might accelerate their development at first,  they would have to maintain an unfamiliar system. We needed to find an integration method that was easier to implement and cheaper to maintain.

Our Requirements
This was a global project, and these were our five focus areas:

  • Ease of management
  • High availability
  • Scalability
  • Use in several regions
  • Cost advantage

We evaluated these focus areas against several different processing models and came up with the following matrix:

Ease of Management High Availability Scalability Use in Several Regions Cost Advantage
IaaS

We could spin up some EC2 instances running IIS on top of Microsoft Windows Server and connected to an RDS DB instance.
 No. We need to take care of our servers.  Yes. We can distribute our servers to different AZs.  Yes. We can use Auto Scaling  Yes. We can use AMIs and copy between regions  Partially. There will be license fees and costs for running EC2 instances .
 PaaS

We could use AWS Elastic Beanstalk.
 Partially. We need to take care of our servers.  Yes. We can distribute our servers to different AZs.  Yes. We can use Auto Scaling.  Yes. We can use environment configurations, AMIs, etc.  Partially. There will be license fees and costs for running EC2 instances.
 FaaS

We could use AWS Lambda.
 Yes. AWS takes care of the services.  Yes. It is already highly available  Yes. It performs at any scale  Yes. We can export/import/upload our configurations easily.  Yes. There are no licenses and we pay only for what we use.

We decided to use Faas (Functions as a Service). We started our project in the EU (Ireland) regions using the following services:

Architecture
Our architecture looks like this:

Amazon VPC: We use Amazon VPC to launch our resources in our private network.

Amazon API Gateway: During the development phase, we started to develop the service in the EU (Ireland) region. At that time, AWS Lambda was not available in EU (Frankfurt). We created two APIs: one for web integration and the other for the admin interface. We used custom authorizers with JSON Web Tokens (JWT) to enable token-based authorization for our APIs. We used mapping templates to pass our variables to our Lambda functions.

In the development phase, there was only a test stage for each API.

During the production phase, AWS Lambda became available in Frankfurt. We decided to move the service there to benefit from low latency access from Turkey. We used the API Gateway Export API feature to export our configuration in Swagger format, and then imported it into Frankfurt. (Before the import, we changed the region definitions in the exported file to eu-central-1.) After that, we created a production stage and used stage variables to parameterize our database definitions of the Amazon RDS instances (like host, username, and so on). We also wanted to use our custom domain name. After we bought an SSL certificate for our domain, we created a custom domain name in the Amazon API Gateway console and created an alias for our CloudFront distribution name (Amazon API Gateway uses Amazon CloudFront in the background). Finally, we created an IAM role to enable Amazon CloudWatch logging for API calls, latency, and more.

Metrics for Get_Joker_offer resource:

AWS Lambda: During the development phase, we used Python to develop our service and created 65 functions for integrating our API methods and scheduled tasks using CloudWatch Events Lambda triggers. Lambda VPC integration became available during the production phase, so we uploaded our functions to the Frankfurt region and integrated them with VPC.

Invocation count of Get_joker_offer Lambda function (The peaks correspond to lunch and dinner times (when people are hungry)):

Amazon RDS: During the development phase, we chose to use Amazon RDS for PostgreSQL. We created a single-AZ RDS instance to test our service. During the production phase, we needed to move our database because we migrated our APIs and functions to Frankfurt. We created a snapshot of our instance and using the Copy snapshot feature of RDS, we successfully moved our database. We launched two instances in our VPC: a multi-AZ instance for production and a single-AZ instance for test purposes. In our API stage variables, we defined the endpoint names of our RDS instances to map the staging to the appropriate instance. We also enabled automated backups for both instances.

Amazon S3: The Joker platform has an admin panel that’s used for managing and reporting Joker offers. To host this administration interface, which is basically a Single Page Application (SPA) with AngularJS, we used the static website hosting feature of Amazon S3. All of the logic and functionality is provided by methods running on Lambda, so we didn’t need a server for the admin interface:

Amazon CloudWatch: We use the service to monitor the usage of our important assets and to get alerts if something goes wrong. We created a custom dashboard to monitor the CPU of our production database, connection count, critical API latencies and function counts and durations.

In our Python code, we log the durations of each inner method in CloudWatch to track performance and find any bottlenecks:

Here’s our CloudWatch dashboard:

Amazon ElasticSearch: During the development phase, Cloudwatch Logs streaming to Amazon ES became available in the Ireland region. Using this feature, we created a Kibana dashboard to monitor some other metrics from the logs we generate from our code. As soon as Amazon ES is available in the Frankfurt region, we will use it again.

Initial Results
The Joker system is in production now, as a pilot for a small region of a country. As you can see from the following chart, the growth of the number of orders is promising. By leveraging serverless architecture, we didn’t have to install and manage an operating system and dependencies. Using Amazon API Gateway, AWS Lambda, Amazon S3, and Amazon RDS, our architecture runs in a highly available environment. We don’t need to learn and manage any master-slave replication features or third-party tools. As our service gets more requests, AWS Lambda adds more Lambda instance, so it runs at any scale. We are able to copy our service to another region using the features of AWS services as we did before going into production. Finally, we don’t run any servers, so we benefit from the cost advantage of serverless architecture.

Here is a representation of the number of orders placed through Joker:

What’s Next
We hope this service will spread to all 34 of the sister companies within Delivery Hero Holding. As the service is rolled out globally, we will deploy to other AWS regions. We plan to choose the region nearest to the company. To optimize our costs, we will purchase reserved instances for our RDS instances. Also, as we monitor our inner methods’ duration, we can re-factor and optimize our code and so that we can decrease our Lambda functions’ execution times.

We believe the future of the cloud is FaaS. We would like to experiment more as other features, services,  and functions become available.

As an AWS Community Hero, I look forward to sharing the Yemeksepeti story with the AWS user group in Turkey. I’d like to help people explore and leverage serverless architecture.

— Onur Salk

 

Enter the AWS Serverless Chatbot Competition!

by Jeff Barr | on | in Amazon API Gateway, AWS Lambda, Developers |

Earlier this month I was asked to be a judge for the AWS Serverless Chatbot Competition and I was happy to accept!

Build a Chatbot
We want you to build a chatbot for Slack using AWS Lambda and Amazon API Gateway. You can also use other APIs (the Slack Events API will be helpful), additional services (AWS or otherwise),  and other data sources.  Your entry should be creative and original, and should provide genuine value to Slack users.

The AWS Free Tier provides you with access to Lambda, API Gateway, and other AWS services at no charge. New and existing AWS users get 1 million Lambda requests and 400,000 GB-seconds of compute time for free. New AWS users get 1 million API calls per month to the API Gateway for up to 12 months after signing up.

Your Entry
After you build your chatbot, I would encourage you to spend some time on the packaging and the marketing. Be sure to supply the following items as part of your entry:

  • Demonstration video of your chatbot in action.
  • Brief explanation of what it does and what makes it unique.
  • Link to your public or private GitHub repo (include all files needed to run your chatbot).
  • Instructions for testing and using your chatbot.

Your chatbot can be a new or existing application, but it must use Lambda and API Gateway. The code must function as shown in the video, and the submission must be in English. It also has to be substantially different from any other app submitted by the same submitter, team, or organization.

Prizes
Individuals, teams, and organizations with 50 or less employees are eligible for the Serverless Chatbot Hero Award. This award includes one ticket to AWS re:Invent and access to discounted hotel room rates, along with recognition at the Serverless State of the Union address, some cool swag, $100 in AWS credits, and publicity opportunities for their bot. We will make up to eight of these awards.

Larger organizations are eligible for a non-cash, recognition-only award.

See the FAQ for more information on rules and eligibility.

Timeline
Here’s the timeline for the competition:

  • August 10 – September 29, 2016 – Submission period.
  • October 3-7 – AWS Judging.
  • October 15 – Winners announced.

Get Started Now
Here’s what you need to do to get started:

  1. Read the Rules and Eligibility Guidelines.
  2. Register for the AWS Serverless Chatbot Competition.
  3. Create AWS and Slack developer accounts.
  4. Visit the Resources Page to learn more about the APIs and services.
  5. Build your chatbot. Our sample code (aws-serverless-chatbot-sample) is a good place to start.
  6. Create your demo video and your other materials for the submission.
  7. Submit your materials before 5 PM ET on September 29, 2016.

I’m looking forward to seeing, using, and judging the entries!

Relevant Webinars
Here are some upcoming webinars that will help you to get started:

You may also find these recordings of past webinars to be helpful:

Jeff;

Arduino Web Editor and Cloud Platform – Powered by AWS

by Jeff Barr | on | in Amazon Internet of Things, Announcements, AWS Lambda |

Last night I spoke with Luca Cipriani from Arduino to learn more about the new AWS-powered Arduino Web Editor and Arduino Cloud Platform offerings. Luca was en-route to the Bay Area Maker Faire and we had just a few minutes to speak, but that was enough time for me to learn a bit about what they have built.

If you have ever used an Arduino, you know that there are several steps involved. First you need to connect the board to your PC’s serial port using a special cable (you can also use Wi-Fi if you have the appropriate add-on “shield”), ensure that the port is properly configured, and establish basic communication. Then you need to install, configure, and launch your development environment, make sure that it can talk to your Arduino, tell it which make and model of Arduino that you are using, and select the libraries that you want to call from your code. With all of that taken care of, you are ready to write code, compile it, and then download it to the board for debugging and testing.

Arduino Code Editor
Luca told me that the Arduino Code Editor was designed to simplify and streamline the setup and development process. The editor runs within your browser and is hosted on AWS (although we did not have time to get in to the details, I understand that they made good use of AWS Lambda and several other AWS services).

You can write and modify your code, save it to the cloud and optionally share it with your colleagues and/or friends. The editor can also detect your board (using a small native plugin) and configure itself accordingly; it even makes sure that you can only write code using libraries that are compatible with your board. All of your code is compiled in the cloud and then downloaded to your board for execution.

Here’s what the editor looks like (see Sneak Peek on the New, Web-Based Arduino Create for more):

Arduino Cloud Platform
Because Arduinos are small, easy to program, and consume very little power, they work well in IoT (Internet of Things) applications. Even better, it is easy to connect them to all sorts of sensors, displays, and actuators so that they can collect data and effect changes.

The new Arduino Cloud Platform is designed to simplify the task of building IoT applications that make use of Arduino technology. Connected devices will be able to be able to connect to the Internet, upload information derived from sensors, and effect changes upon command from the cloud. Building upon the functionality provided by AWS IoT, this new platform will allow devices to communicate with the Internet and with each other. While the final details are still under wraps, I believe that this will pave the wave for sensors to activate Lambda functions and for Lambda functions to take control of displays and actuators.

I look forward to learning more about this platform as the details become available!

Jeff;

 

Machine Learning, Recommendation Systems, and Data Analysis at Cloud Academy

by Jeff Barr | on | in Amazon RDS, Amazon S3, AWS Lambda, Guest Post |

In today’s guest post, Alex Casalboni and Giacomo Marinangeli of Cloud Academy discuss the design and development of their new Inspire system.

Jeff;


Our Challenge
Mixing technology and content has been our mission at Cloud Academy since the very early days. We are builders and we love technology, but we also know content is king. Serving our members with the best content and creating smart technology to automate it is what kept us up at night for a long time.

Companies are always fighting for people’s time and attention and at Cloud Academy, we face those same challenges as well. Our goal is to empower people, help them learn new Cloud skills every month, but we kept asking ourselves: “How much content is enough? How can we understand our customer’s goals and help them select the best learning paths?”

With this vision in mind about six months ago we created a project called Inspire which focuses on machine learning, recommendation systems and data analysis. Inspire solves our problem on two fronts. First, we see an incredible opportunity in improving the way we serve our content to our customers. It will allow us to provide better suggestions and create dedicated learning paths based on an individual’s skills, objectives and industries. Second, Inspire represented an incredible opportunity to improve our operations. We manage content that requires constant updates across multiple platforms with a continuously growing library of new technologies.

For instance, getting a notification to train on a new EC2 scenario that you’re using in your project can really make a difference in the way you learn new skills. By collecting data across our entire product, such as when you watch a video or when you’re completing an AWS quiz, we can gather that information to feed Inspire. Day by day, it keeps personalising your experience through different channels inside our product. The end result is a unique learning experience that will follow you throughout your entire journey and enable a customized continuous training approach based on your skills, job and goals.

Inspire: Powered by AWS
Inspire is heavily based on machine learning and AI technologies, enabled by our internal team of data scientists and engineers. Technically, this involves several machine learning models, which are trained on the huge amount of collected data. Once the Inspire models are fully trained, they need to be deployed in order to serve new predictions, at scale.

Here the challenge has been designing, deploying and managing a multi-model architecture, capable of storing our datasets, automatically training, updating and A/B testing our machine learning models, and ultimately offering a user-friendly and uniform interface to our website and mobile apps (available for iPhone and Android).

From the very beginning, we decided to focus high availability and scalability. With this in mind, we designed an (almost) serverless architecture based on AWS Lambda. Every machine learning model we build is trained offline and then deployed as an independent Lambda function.

Given the current maximum execution time of 5 minutes, we still run the training phase on a separate EC2 Spot instance, which reads the dataset from our data warehouse (hosted on Amazon RDS), but we are looking forward to migrating this step to a Lambda function as well.

We are using Amazon API Gateway to manage RESTful resources and API credentials, by mapping each resource to a specific Lambda function.

The overall architecture is logically represented in the diagram below:

Both our website and mobile app can invoke Inspire with simple HTTPS calls through API Gateway. Each Lambda function logically represents a single model and aims at solving a specific problem. More in detail, each Lambda function loads its configuration by downloading the corresponding machine learning model from Amazon S3 (i.e. a serialized representation of it).

Behind the scenes, and without any impact on scalability or availability, an EC2 instance takes care of periodically updating these S3 objects, as outcome of the offline training phase.

Moreover, we want to A/B test and optimize our machine learning models: this is transparently handled in the Lambda function itself by means of SixPack, an open-source A/B testing framework which uses Redis.

Data Collection Pipeline
As far as data collection is concerned, we use Segment.com as data hub: with a single API call, it allows us to log events into multiple external integrations, such as Google Analytics, Mixpanel, etc. We also developed our own custom integration (via webhook) in order to persistently store the same data in our AWS-powered data warehouse, based on Amazon RDS.

Every event we send to Segment.com is forwarded to a Lambda function – passing through API Gateway – which takes care of storing real-time data into an SQS queue. We use this queue as a temporary buffer in order to avoid scalability and persistency problems, even during downtime or scheduled maintenance. The Lambda function also handles the authenticity of the received data thanks to a signature, uniquely provided by Segment.com.

Once raw data has been written onto the SQS queue, an elastic fleet of EC2 instances reads each individual event – hence removing it from the queue without conflicts – and writes it into our RDS data warehouse, after performing the required data transformations.

The serverless architecture we have chosen drastically reduces the costs and problems of our internal operations, besides providing high availability and scalability by default.

Our Lambda functions have a pretty constant average response time – even during load peaks – and the SQS temporary buffer makes sure we have a fairly unlimited time and storage tolerance before any data gets lost.

At the same time, our machine learning models won’t need to scale up in a vertical or distributed fashion since Lambda takes care of horizontal scaling. Currently, they have an incredibly low average response time of 1ms (or less):

We consider Inspire an enabler for everything we do from a product and content perspective, both for our customers and our operations. We’ve worked to make this the core of our technology, so that its contributions can quickly be adapted and integrated by everyone internally. In the near future, it will be able to independently make decisions for our content team while focusing on our customers’ need.  At the end of the day, Inspire really answers our team’s doubts on which content we should prioritize, what works better and exactly how much of it we need. Our ultimate goal is to improve our customer’s learning experience by making Cloud Academy smarter by building real intelligence.

Join our Webinar
If you would like to learn more about Inspire, please join our April 27th webinar – How we Use AWS for Machine Learning and Data Collection.

Alex Casalboni, Senior Software Engineer, Cloud Academy
Giacomo Marinangeli, CTO, Cloud Academy

PS – Cloud Academy is hiring – check out our open positions!

Using Enhanced RDS Monitoring with Datadog

by Jeff Barr | on | in Amazon CloudWatch, AWS Lambda, Guest Post |

Today’s guest post comes from K Young, Director of Strategic Initiatives at Datadog!

Jeff;


AWS recently announced enhanced monitoring for Amazon RDS instances running MySQL, MariaDB, and Aurora. Enhanced monitoring includes over 50 new CPU, memory, file system, and disk I/O metrics which can be collected on a per-instance basis as frequently as once per second.

AWS and Datadog
AWS worked closely with Datadog to help customers send this new high-resolution data to Datadog for monitoring. Datadog is an infrastructure monitoring platform that is very popular with AWS customers—you can see historical trends with full granularity and also visualize and alert on live data from any part of your stack.

With a few minutes of work your enhanced RDS metrics will immediately begin populating a pre-built, customizable dashboard in Datadog:

Connect RDS and Datadog
The first step is to send enhanced RDS metrics to CloudWatch Logs. You can enable the metrics during instance creation, or on an existing RDS instance by selecting it in the RDS Console and then choosing Instance OptionsModify:

Set Granularity to 1–60 seconds; every 15 seconds is often a good choice. Once enabled, enhanced metrics will be sent to CloudWatch Logs.

The second step is to send the CloudWatch Log data to Datadog. Begin by setting up a Lambda function to process the logs and send the metrics:

  1. Create a role for your Lambda function. Name it something like lambda-datadog-enhanced-rds-collector and select AWS Lambda as the role type.
  2. From the Encryption Keys tab on the IAM Management Console, create a new encryption key. Enter an Alias for the key like lambda-datadog-key. On the next page, add the appropriate administrators for the key. Next you’ll be prompted to add users to the key. Add at least two: yourself (so that you can encrypt the Datadog API key from the AWS CLI in the next step), and the role created above, e.g. lambda-datadog-enhanced-rds-collector (so that it can decrypt the API key and submit metrics to Datadog). Finish creating the key.
  3. Encrypt the token using the AWS Command Line Interface (CLI), providing the Alias of your just-created key (e.g. lambda-datadog-key) as well as your Datadog keys, available here. Use KMS to encrypt your key, like this:
    $ aws kms encrypt --key-id alias/ALIAS_KEY_NAME --plaintext '{"api_key":"DATADOG_API_KEY", "app_key":"DATADOG_APP_KEY"}'

    Save the output of this command; you will need it for the next step.

  4. From the Lambda Management Console, create a new Lambda Function. Filter blueprints by datadog, and select the datadog-process-rds-metrics blueprint.
  5. Choose RDSOSMetrics from the Log Group dropdown, enter the Filter Name of your choice, and go to the next page. If you have not yet enabled enhanced monitoring, you must do so before RDSOSMetrics will be presented an as option (see the instructions under Connect RDS and Datadog above):
  6. Give your function a name like send-enhanced-rds-to-datadog. In the Lambda function code area, replace the string after KMS_ENCRYPTED_KEYS with the ciphertext blob part of the CLI command output above.
  7. Under Lambda function handler and role, choose the role you created in step 2, e.g. lambda-datadog-enhanced-rds-collector. Go to the next page, select the Enable Now radio button, and create your function.

That’s It
Once you have enabled RDS in Datadog’s AWS integration tile, Datadog will immediately begin displaying your enhanced RDS metrics. Your RDS instances will be individually identifiable in Datadog via automatically-created tags of the form dbinstanceidentifier:YOUR_DB_INSTANCE_NAME, as well as any tags you added through the RDS console.

You can clone the pre-built dashboard and customize it however you want: add RDS metrics that are not displayed by default, or start correlating RDS metrics with the performance of the rest of your stack.

— K Young, Director of Strategic Initiatives