AWS AI Blog

In the Research Spotlight: Zornitsa Kozareva

by Victoria Kouyoumjian

As AWS continues to support the Artificial Intelligence (AI) community with contributions to Apache MXNet and the release of Amazon Lex, Amazon Polly, and Amazon Rekognition managed services, we are also expanding our team of AI experts, who have one primary mission: To lower the barrier to AI for all AWS developers, making AI more accessible and easy to use. As Swami Sivasubramanian, VP of Machine Learning at AWS, succinctly stated, “We want to democratize AI.”

In our Research Spotlight series, I spend some time with these AI team members for in-depth conversations about their experiences and get a peek into what they’re working on at AWS.


Dr. Zornitsa Kozareva joined AWS in June 2016 as the Manager of Applied Science for Deep Learning, focusing on natural language processing (NLP) and dialog applications. Zornitsa is a recipient of the John Atanasoff Award, given to her by the President of the Republic of Bulgaria in 2016 for her contributions and impact in science, education, and industry; the Yahoo! Labs Excellence Award in 2014; and the RANLP Young Researcher Award in 2011. You can read more about Dr. Kozareva on her website, or visit Google Scholar to find her 80 papers and 1,464 citations.

Getting into the field of natural language processing

Zornitsa’s interest in the field of natural language processing dates back to 2003, when she was doing her undergraduate studies in computer science in her native Bulgaria. In her third year of undergrad, she applied to the Leonardo Da Vinci Program, which is funded by the European Commission. She was selected to conduct research on multilingual information retrieval at the New University of Lisbon, Portugal. “This was a really great experience. I learned how to build a search engine; how to innovate, write, and publish scientific papers; and, most importantly, how to share my findings with the rest of the research community. For an undergrad such as myself, this opened my eyes to a brand new horizon.”

From that moment, Zornitsa says that she was “mesmerized by machine learning and its ability to solve natural language problems. I became super passionate about the field and I decided that I wanted to pursue a PhD in NLP.”

In 2004, Zornitsa went to Spain for graduate studies, where she worked on “a wide spectrum of topics, including information extraction, semantics, and question answering. This is how my career in NLP started.”

While working toward her PhD, Zornitsa had the opportunity to do a full-year internship. “I picked the Information Sciences Institute, located in Los Angeles, because I wanted to work with world-renowned leaders in the NLP field, such as Dr. Eduard Hovy. For a year, I worked with Dr. Hovy and Dr. Ellen Riloff conducting research on knowledge extraction. It was a great learning experience, and I also received valuable career advice. Right after I graduated, I decided that I wanted to come back to the US and continue to enhance my scientific career.”

(more…)

Build a Real-time Object Classification System with Apache MXNet on Raspberry Pi

by Aran Khanna

In the past five years, deep neural networks have solved many computationally difficult problems, particularly in the field of computer vision. Because deep networks require a lot of computational power to train, often using tens of GPUs, many people assume that you can run them only on powerful cloud servers. In fact, after a deep network model has been trained, it needs relatively few computational resources to run predictions. This means that you can deploy a model on lower-powered edge (non-cloud) devices and run it without relying on an internet connection.

Enter Apache MXNet, Amazon’s open source deep learning engine of choice. In addition to effectively handling multi-GPU training and deployment of complex models, MXNet produces very lightweight neural network model representations. You can deploy these representations on devices with limited memory and compute power. This makes MXNet perfect for running deep learning models on devices like the popular $35 Raspberry Pi computer.

In this post, we walk through creating a computer vision system using MXNet for the Raspberry Pi. We also show how to use AWS IoT to connect to the AWS Cloud. This allows you to use the Cloud to manage a lightweight convolutional neural network running real-time object recognition on the Pi.

Prerequisites

To follow this post, you need a Raspberry Pi 3 Model B device running Jessie or a later version of the Raspbian operating system, the Raspberry Pi Camera Module v2, and an AWS account.

Setting up the Raspberry Pi

First, you set up the Pi with the camera module to turn it into a video camera, and then install MXNet. This allows you to start running deep network-based analysis on everything that the Pi “sees.”

Set up your Pi with the Camera Module and connect the device to the Internet, either through the Ethernet port or with WiFi. Then, open the terminal and type the following commands to install the Python dependencies for this post:

sudo apt-get update
sudo apt-get install python-pip python-opencv python-scipy \
python-picamera

Build MXNet for the Pi with the corresponding Python bindings by following the instructions for Devices. For this tutorial, you won’t need to build MXNet with OpenCV.
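Once MXNet is installed, a quick way to sanity-check the setup is to load a pretrained image-classification model and run a single prediction on a frame from the Pi camera. The following Python sketch assumes you have already downloaded a pretrained ImageNet-style checkpoint and its synset.txt label file; the model file names below are placeholders, not necessarily the ones used later in this walkthrough.

import mxnet as mx
import numpy as np
import picamera

# Placeholder checkpoint prefix: expects squeezenet_v1.1-symbol.json and
# squeezenet_v1.1-0000.params in the working directory.
sym, arg_params, aux_params = mx.model.load_checkpoint('squeezenet_v1.1', 0)
mod = mx.mod.Module(symbol=sym, context=mx.cpu(), label_names=None)
mod.bind(for_training=False, data_shapes=[('data', (1, 3, 224, 224))])
mod.set_params(arg_params, aux_params, allow_missing=True)

with open('synset.txt') as f:
    labels = [line.rstrip() for line in f]

def classify(frame_rgb):
    # frame_rgb: 224x224x3 uint8 array captured from the camera
    img = frame_rgb.astype('float32').transpose((2, 0, 1))  # HWC -> CHW
    img = img[np.newaxis, :]                                # add batch dimension
    mod.forward(mx.io.DataBatch([mx.nd.array(img)]))
    prob = mod.get_outputs()[0].asnumpy().squeeze()
    top5 = prob.argsort()[::-1][:5]
    return [(labels[i], float(prob[i])) for i in top5]

# Capture one frame from the Pi camera module and classify it
with picamera.PiCamera() as camera:
    camera.resolution = (224, 224)
    frame = np.empty((224, 224, 3), dtype=np.uint8)
    camera.capture(frame, 'rgb')
    print(classify(frame))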

(more…)

“Greetings, visitor!” — Engage Your Web Users with Amazon Lex

by Niranjan Hira

All was well with the world last night. You went to bed thinking about convincing your manager to add some time in the next sprint for much-needed improvements to the recommendation engine for shoppers on your website. The machine learning models are out of date and people are complaining, but no one is looking past the one-off tickets that stream in every day. You wake up to the usual flurry of email.

But what’s this? You learn that the Chief Marketing Officer is at an industry conference where she’s heard the buzz about conversational experiences. She just tried out some chatbots, and now she wants one for the site. She wants to connect with shoppers one-on-one to offer them a personalized experience. That’s a fun technology problem. As long as the management team hires someone to help with the look and feel, you can focus on the fun part of putting the chatbot together.

In this post, we show how easy it is to create a chatbot and a personalized web experience for your customers using Amazon Lex and other AWS services.

What do you need to prove?

Personalized experience covers a lot of ground, but you have ideas. You could create a virtual shopping assistant that can answer questions about products; check colors, styles, and pricing; offer product recommendations; bring up relevant deals; remember shopping preferences; look up ratings and reviews (the most useful and recent reviews first, of course); or wait … maybe even talk about what the Twittiverse thinks. But first you have to nail basic stuff like “Do you have this in red?”, “Where can I get it?”, and “What’s the return policy?”

Basically, you need to prove:

  1. That you can build a bot quickly (check, you have Amazon Lex for that)
  2. That you can integrate your bot with the site (and later on, you might use AWS Lambda to connect to other apps)
  3. That it’s easy to monitor the bot and update it (you’re not really sure about this one)

For starters, you decide to keep it simple: build an example bot using Amazon Lex, wire it up to static HTML, connect it to a stub service, and see what it takes to update the bot. This is going to be fun!

Build an Amazon Lex bot

The specific bot isn’t important. You just want to make sure that you can put together a web experience that integrates with a service on the backend. You can start with the Amazon Lex BookTrip example. It takes a couple of minutes, but when you’re done, you’re ready to test the “Return parameters to client” (no code hooks yet) version of the bot. San Francisco for two nights, anyone?

Next, you follow the instructions to use a blueprint to create a Lambda function (BookTripCodeHook) that will serve as the code hook for initialization, data validation, and fulfillment activities. You use the Test events from the Sample event template list to confirm that the code works as expected and that you don’t have any setup or permissions issues.
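If you want a feel for what the code hook boils down to before opening the blueprint, here is a minimal Python sketch of a Lex (V1) Lambda handler. The behavior and the Location slot are placeholders for illustration; the BookTripCodeHook blueprint handles validation and fulfillment in much more detail.

# Minimal sketch of a Lex V1 code-hook handler (placeholder slot handling);
# the BookTripCodeHook blueprint referenced above is more complete.
def lambda_handler(event, context):
    slots = event['currentIntent']['slots']
    source = event['invocationSource']  # 'DialogCodeHook' or 'FulfillmentCodeHook'

    if source == 'DialogCodeHook':
        # Validation step: let Lex keep prompting for any missing slots.
        return {
            'dialogAction': {
                'type': 'Delegate',
                'slots': slots
            }
        }

    # Fulfillment step: pretend the booking succeeded and close the conversation.
    return {
        'dialogAction': {
            'type': 'Close',
            'fulfillmentState': 'Fulfilled',
            'message': {
                'contentType': 'PlainText',
                'content': 'Okay, I have booked your stay in {}.'.format(
                    slots.get('Location', 'your destination'))
            }
        }
    }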

(more…)

In the Research Spotlight: Hassan Sawaf

by Victoria Kouyoumjian

As AWS continues to support the Artificial Intelligence (AI) community with contributions to Apache MXNet and the release of Amazon Lex, Amazon Polly, and Amazon Rekognition managed services, we are also expanding our team of AI experts, who have one primary mission: To lower the barrier to AI for all AWS developers, making AI more accessible and easy to use. As Swami Sivasubramanian, VP of Machine Learning at AWS, succinctly stated, “We want to democratize AI.”

In our Research Spotlight series, I spend some time with these AI team members for in-depth conversations about their experiences and get a peek into what they’re working on at AWS.


Hassan Sawaf has been with Amazon since September 2016. This January, he joined AWS as Director of Applied Science and Artificial Intelligence.

Hassan has worked in the automatic speech recognition, computer vision, natural language understanding, and machine translation fields for 20+ years. In 1999, he cofounded AIXPLAIN AG, a company focusing on speech recognition and machine translation. His partners included Franz Josef Och, who went on to start the Google Translate team; Stephan Kanthak, now a Group Manager with Nuance Communications; and Stefan Ortmanns, today Senior Vice President, Mobile Engineering and Professional Services with Nuance Communications. Hassan also spent time at SAIC as Chief Scientist for Human Language Technology, where he worked on multilingual spoken dialogue systems. Coincidentally, his peer from Raytheon BBN Technologies was Rohit Prasad, who is now VP and Head Scientist for Amazon Alexa.

How did you get started?

“I started working in development on information systems in airports, believe it or not. Between airlines and airports, and from airport to airport, the communication used to be via Telex messages, using something similar to “shorthand” information about the plane. These messages included information such as Who has boarded the plane? What’s the cargo? How is the baggage distributed on the plane? How much fuel does it have? What kinds of passengers (first class, business class), etc. This kind of information was sent from airline to airport before the plane landed. But by the 1990s, flight travel had grown exponentially. And it used to be that humans had to read this information and translate that into actions in the airport. So, we built the technology that could do this fully automatically, so that manual human intervention was no longer needed. People no longer needed to sit there reading Telex messages and typing ahead on the computer. We converted this such that the process was completely done by machine. This was my first project in natural language understanding.

(more…)

Using Amazon Rekognition to Identify Persons of Interest for Law Enforcement

by Chris Adzima

This is a guest post by Chris Adzima, a Senior Information Systems Analyst for the Washington County Sheriff’s Office. 

In law enforcement, it is extremely important to identify persons of interest quickly. In most cases, this is accomplished by showing a picture of the person to multiple law enforcement officers in hopes that someone knows the person. In Washington County, Oregon, there are nearly 20,000 different bookings (when a person is processed into the jail) every year. As time passes, officers’ memories of individual bookings fade. Also, in most cases, investigations move very quickly. Waiting for an officer to come on duty to identify a picture might mean missing the opportunity to solve the case.

In this post, I discuss our decision to use AWS for facial recognition. I walk through setting up web and mobile applications using AWS, demonstrating how easy it is even for someone who is new to AWS. I then show how we used Amazon Rekognition to build a powerful tool for solving crimes.

The following diagram shows the system architecture:

Setup

When we were presented with the problem of quickly identifying persons of interest, we thought it seemed like something we could automate instead of resorting to the usual manual processes. We wanted to be able to not only get responses back to the officers within seconds, but also to ensure that officers’ memory wasn’t going to be a limiting factor.

This is where we turned to AWS and Amazon Rekognition. We had not used AWS, but we had read a release announcement about Amazon Rekognition a few days prior to being approached about fixing the identification process. We thought this would be a great product to test.
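To give a sense of what the Rekognition calls look like, here is a short boto3 sketch of the two core operations involved: indexing booking photos into a collection, and searching that collection with a probe image. The collection name, bucket, and file names below are placeholders for illustration, not production values.

import boto3

rekognition = boto3.client('rekognition', region_name='us-west-2')

COLLECTION = 'booking-photos'  # placeholder collection name

# One-time setup: create a collection and index each booking photo stored in S3.
rekognition.create_collection(CollectionId=COLLECTION)
rekognition.index_faces(
    CollectionId=COLLECTION,
    Image={'S3Object': {'Bucket': 'example-booking-photos', 'Name': 'booking-12345.jpg'}},
    ExternalImageId='booking-12345'   # ties the detected face back to the booking record
)

# Later: search the collection with a probe image, such as a surveillance still.
with open('probe.jpg', 'rb') as image:
    response = rekognition.search_faces_by_image(
        CollectionId=COLLECTION,
        Image={'Bytes': image.read()},
        FaceMatchThreshold=80,
        MaxFaces=5
    )

for match in response['FaceMatches']:
    print(match['Face']['ExternalImageId'], match['Similarity'])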

(more…)

Activity Tracking with a Voice-Enabled Bot on AWS

by Bob Strahan, Oliver Atoa, and Bob Potterveld

It’s New Year’s Eve. Your friends and loved ones have gone to the party, but you can’t go just yet because you haven’t figured out how to track the key performance indicators for your New Year’s resolution.

You’ve already divided your resolution into categories, and you’ve set personal targets for each category. Now you just need to log your activities and routinely calculate how you’re doing, so you stay on track. But you’ve been down this road before. You know that in a few short days log keeping will become tedious. You’ll start putting it off, and then you’ll forget. Before you know it, your resolution has gone the way of so many resolutions before it.

We’ve all been there. Right?

This year, you want a new way to log your activities: a fun and easy way that will keep you engaged and prevent the procrastination that has so often proved disastrous. The midnight deadline is approaching; you need to implement this thing quickly so you can get to the party and celebrate the arrival of the New Year, secure in the knowledge that this year will be different!

In this post, we provide a solution to this perennial problem: a sample tracking bot application, called TrackingBot, which lets you log your activities by talking to it. The following flowchart shows the steps required to develop and use TrackingBot:

(more…)

Capturing Voice Input in a Browser and Sending it to Amazon Lex

by Andrew Lafranchise

Ever since we released Amazon Lex, customers have asked us how to embed voice into a web application. In this blog post, we show how to build a simple web application that uses the AWS SDK for JavaScript to do that. The example application, which users can access from a browser, records audio, sends the audio to Amazon Lex, and plays the response. Using browser APIs and JavaScript we show how to request access to a microphone, record audio, downsample the audio, and PCM encode the audio as a WAV file. As a bonus, we show how to implement silence detection and audio visualization, which are essential to building a user-friendly audio control.

Prerequisites

This post assumes you have some familiarity with

Don’t want to scroll through the details? You can download the example application here: https://github.com/awslabs/aws-lex-browser-audio-capture

The following sections describe how to accomplish important pieces of the audio capture process. You don’t need to copy and paste them; they are intended as a reference. You can see everything working together in the example application.

Requesting access to a microphone with the MediaDevices API

To capture audio content in a browser, you need to request access to an audio device, in this case, the microphone. To access the microphone, you use the navigator.mediaDevices.getUserMedia method in the MediaDevices API. To process the audio stream, you use the AudioContext interface in the Web Audio API. The code that follows performs these tasks:

  1. Creates an AudioContext
  2. Calls the getUserMedia method and requests access to the microphone. The getUserMedia method is supported in Chrome, Firefox, Edge, and Opera. We tested the example code in Chrome and Firefox.
  3. Creates a media stream source and a Recorder object. More about the Recorder object later.
  // control.js
 
  /**
   * Audio recorder object. Handles setting up the audio context, 
   * accessing the mike, and creating the Recorder object.
   */
  lexaudio.audioRecorder = function() {
    /**
     * Creates an audio context and calls getUserMedia to request the mic (audio).
     * If the user denies access to the microphone, the returned Promise is rejected
     * with a PermissionDeniedError.
     * @returns {Promise} 
     */
    var requestDevice = function() {
 
      if (typeof audio_context === 'undefined') {
        window.AudioContext = window.AudioContext || window.webkitAudioContext;
        audio_context = new AudioContext();
      }
 
      return navigator.mediaDevices.getUserMedia({ audio: true })
        .then(function(stream) {
          audio_stream = stream; 
        });
    };
 
    var createRecorder = function() {
      // Pass the worker (defined elsewhere in control.js) to the recorder;
      // createMediaStreamSource takes only the stream.
      return recorder(audio_context.createMediaStreamSource(audio_stream), worker);
    };
 
    return {
      requestDevice: requestDevice,
      createRecorder: createRecorder
    };
 
  };

The code snippet illustrates the following important points:

  • The user has to grant us access to the microphone. Most browsers request this with a pop-up. If the user denies access to the microphone, the returned Promise is rejected with a PermissionDeniedError.
  • In most cases, you need only one AudioContext instance. Browsers set limits on the number of AudioContext instances you can create and throw exceptions if you exceed them.
  • We use a few elements and APIs (audio element, createObjectURL, and AudioContext) that require thorough feature detection in a production environment.
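For reference, the runtime call that the browser ultimately makes is Amazon Lex’s PostContent API. A rough server-side equivalent, sketched in Python with boto3, looks like the following; the bot name, alias, and audio file are placeholders, and the audio is assumed to be raw 16 kHz, 16-bit mono PCM, the same format the browser code produces before sending.

import boto3

lex = boto3.client('lex-runtime', region_name='us-east-1')

# Placeholder bot name and alias; request.pcm is raw 16-bit little-endian PCM at 16 kHz.
with open('request.pcm', 'rb') as audio:
    response = lex.post_content(
        botName='OrderFlowers',
        botAlias='$LATEST',
        userId='web-user-1234',
        contentType='audio/l16; rate=16000; channels=1',
        accept='audio/mpeg',
        inputStream=audio.read()
    )

print(response['intentName'], response['dialogState'], response.get('message'))

# The synthesized speech reply comes back as a stream that can be played or saved.
with open('reply.mp3', 'wb') as out:
    out.write(response['audioStream'].read())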

(more…)

Updated AWS Deep Learning AMIs with Apache MXNet 0.10 and TensorFlow 1.1 Now Available

by Victoria Kouyoumjian

You can now use Apache MXNet v0.10 and TensorFlow v1.1 with the AWS Deep Learning AMIs for Amazon Linux and Ubuntu. Apache MXNet announced version 0.10, available at http://mxnet.io, with significant improvements to documentation and tutorials, including updated installation guides for running MXNet on various operating systems and environments, such as NVIDIA’s Jetson TX2. In addition, current tutorials have been augmented with definitions of basic concepts around foundational development components. API documentation is now more comprehensive, with accompanying samples. Python pip packages are now available for the v0.10 release, making it easy to install MXNet on macOS or on Linux CPU or GPU environments. These packages also include Intel’s Math Kernel Library (MKL) support for accelerating math routines on Intel CPUs.
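If you install the pip packages outside the AMIs, a quick Python check confirms that the 0.10 build is importable and working. The GPU package name in the comment is an assumption tied to a CUDA 8 environment.

# After `pip install mxnet` (CPU) or `pip install mxnet-cu80` (CUDA 8 GPU):
import mxnet as mx

print(mx.__version__)       # expect '0.10.0'
a = mx.nd.ones((2, 3))      # defaults to the CPU context
print((a * 2).asnumpy())    # [[2. 2. 2.], [2. 2. 2.]]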

Visit the AWS Marketplace to get started with the AWS Deep Learning AMI v1.4_Jun2017 for Ubuntu and the AWS Deep Learning AMI v2.2_Jun2017 for Amazon Linux. The AWS Deep Learning AMIs are available in the following public AWS regions: US East (N. Virginia), US West (Oregon), and EU (Ireland).

Tuning Your DBMS Automatically with Machine Learning

by Dana Van Aken, Geoff Gordon, and Andy Pavlo

This is a guest post by Dana Van Aken, Andy Pavlo, and Geoff Gordon of Carnegie Mellon University. This project demonstrates how academic researchers can leverage our AWS Cloud Credits for Research Program to support their scientific breakthroughs.

Database management systems (DBMSs) are the most important component of any data-intensive application. They can handle large amounts of data and complex workloads. But they’re difficult to manage because they have hundreds of configuration “knobs” that control factors such as the amount of memory to use for caches and how often to write data to storage. Organizations often hire experts to help with tuning activities, but experts are prohibitively expensive for many.

OtterTune, a new tool that’s being developed by students and researchers in the Carnegie Mellon Database Group, can automatically find good settings for a DBMS’s configuration knobs. The goal is to make it easier for anyone to deploy a DBMS, even those without any expertise in database administration.

OtterTune differs from other DBMS configuration tools because it leverages knowledge gained from tuning previous DBMS deployments to tune new ones. This significantly reduces the amount of time and resources needed to tune a new DBMS deployment. To do this, OtterTune maintains a repository of tuning data collected from previous tuning sessions. It uses this data to build machine learning (ML) models that capture how the DBMS responds to different configurations. OtterTune uses these models to guide experimentation for new applications, recommending settings that improve a target objective (for example, reducing latency or improving throughput).

In this post, we discuss each of the components in OtterTune’s ML pipeline, and show how they interact with each other to tune a DBMS’s configuration. Then, we evaluate OtterTune’s tuning efficacy on MySQL and Postgres by comparing the performance of its best configuration with configurations selected by database administrators (DBAs) and other automatic tuning tools.

OtterTune is an open source tool that was developed by students and researchers in the Carnegie Mellon Database Research Group. All code is available on GitHub, and is licensed under Apache License 2.0.

How OtterTune works

The following diagram shows the OtterTune components and workflow.

At the start of a new tuning session, the user tells OtterTune which target objective to optimize (for example, latency or throughput). The client-side controller connects to the target DBMS and collects its Amazon EC2 instance type and current configuration.

Then, the controller starts its first observation period, during which it observes the DBMS and records the target objective. When the observation period ends, the controller collects internal metrics from the DBMS, like MySQL’s counters for pages read from disk and pages written to disk. The controller returns both the target objective and the internal metrics to the tuning manager.
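OtterTune’s actual models are more sophisticated, but the core recommendation idea can be sketched in a few lines of Python: fit a regression model on previously observed (configuration, objective) pairs, then score candidate configurations and pick the one predicted to perform best. The knob names and numbers below are made up purely for illustration; this is not OtterTune’s pipeline.

# Toy sketch of configuration recommendation: fit a Gaussian process on
# observed (knob settings -> latency) pairs and pick the candidate
# configuration predicted to be fastest.
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor

# Each row: [buffer_pool_size_gb, log_file_size_mb] -- placeholder knobs.
observed_configs = np.array([[1, 64], [4, 256], [8, 512], [2, 128]])
observed_latency_ms = np.array([95.0, 60.0, 52.0, 78.0])

model = GaussianProcessRegressor().fit(observed_configs, observed_latency_ms)

candidates = np.array([[6, 384], [8, 1024], [12, 512]])
pred_mean, pred_std = model.predict(candidates, return_std=True)

best = candidates[np.argmin(pred_mean)]
print('recommended knobs:', best)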

(more…)

In the Research Spotlight: Edo Liberty

by Victoria Kouyoumjian

As AWS continues to support the Artificial Intelligence (AI) community with contributions to Apache MXNet and the release of Amazon Lex, Amazon Polly, and Amazon Rekognition managed services, we are also expanding our team of AI experts, who have one primary mission: To lower the barrier to AI for all AWS developers, making AI more accessible and easy to use. As Swami Sivasubramanian, VP of Machine Learning at AWS, succinctly stated, “We want to democratize AI.”

In our Research Spotlight series, I spend some time with these AI team members for in-depth conversations about their experiences and get a peek into what they’re working on at AWS.


Edo Liberty is a Principal Scientist at Amazon Web Services (AWS) and the manager of the Algorithms Group at Amazon AI. His work has received more than 1000 citations since 2012. You can view his conference and journal publications, patents, and manuscripts in process on www.edoliberty.com, or access his papers on Google Scholar.

Although fully immersed in AI today, Edo shared with me that he originally wanted to be a physicist when he started college in Tel Aviv. “I knew absolutely nothing about computers and I felt I would be a lousy physicist if I didn’t learn to code, at least a little bit.” So, he minored in computer science, “even though I knew I was going to hate it,” he admitted. “But the more I learned, physics became more numerical, technical, and counterintuitive as relativity and quantum mechanics kicked in. At the same time, computer science became less technical and more abstract and beautiful. Computer science stopped being about software, and started being about algorithms and math and complexity and that’s when it became very interesting. I ended up shifting to a major in computer science, and really fell in love with this whole field.”

In 2004, Edo moved to the United States and completed his PhD in computer science at Yale University, where he started getting into a lot of different types of machine learning, data science, and data mining. He ended up most interested in math and algorithms – specifically, the theory and the algorithms behind big data. “Back then, we were working on hyperspectral images. Every image was about 1½ GB, but my desktop only had 512 MB of memory. That was big data for me! But I still needed to analyze the image, so I had to really figure out what to do.”

Edo finished his PhD doing theoretical computer science and then completed a postdoctorate in applied math. He then founded a startup in New York City, building a distributed video search platform “with lots of algorithms and systems and math, and it was very exciting and a lot of fun.”

(more…)