AWS News Blog

Pollexy – Building a Special Needs Voice Assistant with Amazon Polly and Raspberry Pi

April is Autism Awareness month and about 1 in 68 children in the U.S. have been identified with autism spectrum disorder (ASD) (CDC 2014). In this post from Troy Larson, a Sr. Devops Cloud Architect here at AWS, you get an introduction to a project he has been working on to help his son Calvin.

I have been asked how the minds at AWS come up with so many different ideas. Sometimes they come from a deeply personal place, where someone sees a way to help others. Pollexy is an amazing example of just that. Read about Pollexy and then watch the video here.

-Ana


Background

As a computer programming parent of a 16-year old non-verbal teenage boy with autism, I have been constantly searching over the years to find ways to use technology to make our lives together safer, happier and more comfortable. At the core of this challenge is the most basic of all human interaction—communication. While Calvin is able to respond to verbal instruction, he is not able to speak responsively. In his entire life, we’ve never had a conversation. He is able to be left alone in his room to play, but most every task or set of tasks requires a human to verbally prompt him along the way. Having other children and responsibilities in the home, at times the intensity of supervision can be negatively impactful on the home dynamic.

Genesis

When I saw the announcement of Amazon Polly and Amazon Lex at re:Invent last year, I immediately started churning on how we could leverage these technologies to assist Calvin. He responds well to human verbal prompts, but would he understand a digital voice? So one Saturday, I setup a Raspberry Pi in his room and closed his door and crouched around the corner with other family members so Calvin couldn’t see us. I connected to the Raspberry Pi and instructed Polly to speak in Joanna’s familiar pacific tone, “Calvin, it’s time to take a potty break. Go out of your bedroom and go to the bathroom.” In a few seconds, we heard his doorknob turn and I poked my head out of my hiding place. Calvin passed by, looking at me quizzically, then went into the bathroom as Joanna had instructed. We all looked at each other in amazement—he had listened and responded perfectly to the completely invisible voice of someone he’d never heard before. After discussing some ideas around this with co-workers, a colleague suggested I enter the IoT and AI Science Fair at our annual AWS Sales Kick-Off meeting. Less than two months after the Polly and Lex announcement and 3500 lines of code later, Pollexy—along with Calvin–debuted at the Science Fair.

Overview

Pollexy (“Polly” + “Lex”) is a Raspberry Pi and mobile-based special needs verbal assistant that lets caretakers schedule audio task prompts and messages both on a recurring schedule and/or on-demand. Caretakers can schedule regular medicine reminder messages or hourly bathroom break messages, for example, and at the same time use their Amazon Echo and mobile device to request a specific message be played immediately. Caretakers can even set it up so that the person needs to confirm that they’ve heard the message. For example, my son won’t pay attention to Pollexy unless Pollexy first asks him to “Push the blue button.” Pollexy will wait until he has pushed the button and then speak the actual message. Other people may be able to respond verbally using Lex, or not require a confirmation at all. Pollexy can be tailored to what works best.

And then most importantly—and most challenging—in a large house, how do we make sure the person is in the room where we play the message? What if we have a special needs adult living in an in-law suite? Are they in the living room or the kitchen? And what about multiple people? What if we have multiple people in different areas of the house, each of whom has a message? Let’s explore the basic elements and tie the pieces together.

Basic Elements of Pollexy

In the spirit of Amazon’s Leadership Principle “Invent and Simplify,” we want to minimize the complexity of the Pollexy architecture. We can break Pollexy down into three types of objects and three components, all of which work together in a way that’s easily explainable.

Object #1: Person

Pollexy can support any number of people. A person is a uniquely identifiable name. We can set basic preferences such as “requires confirmation” and most importantly, we can define a location schedule. This means that we can create an Outlook-like schedule that sets preferences where someone should be in the house.

Object #2: Location

A location is simply a uniquely identifiable location where a device is physically sitting. Based on the user’s location schedule, Pollexy will know which device to contact first, second, third, etc. We can also “mute” devices if needed (naptime, etc.)

Object #3: Message

Obviously, this is the actual message we want to play. Attached to each message is a person and a recurring schedule (only if it’s not a one-time message). We don’t store location with the message, because Pollexy figures out the person’s location when the message is ready to be delivered.

Component #1: Scheduler

Every message needs to be scheduled. This is a command-line tool where you basically say Tell “Calvin” that “you need to brush your teeth” every night at 8 p.m. This message is then stored in DynamoDB, waiting to be picked up by the queueing Lambda function.

Component #2: Queueing Engine

Every minute, a Lambda runs and checks the scheduler to see if there is a message or messages ready to be delivered. If a message is ready, it looks up the person’s location schedule and figures out where they are and then pushes the message or messages into an SQS queue for that location.

Component #3: Speaker Engine

Every minute on the Raspberry Pi device, the speaker engine spins up and checks the SQS for its location. If there are messages, then the speaker engine looks at the user’s preferences and initiates communication to convey the message. If the person doesn’t respond, the speaker engine will check if the person has a secondary location in their schedule and drop the message in the SQS Queue for that location. In the end, a message will either be delivered or eventually just timeout (if someone is out of the house for the day).

Respect and Freedom are the Keys

We often take our personal privacy and respect for granted, so imagine even for a special needs person, the lack of privacy and freedom around having a person constantly in your presence. This is exaggerated for those in the autism spectrum where invasion of personal space can escalate a sense of invasion, turning into anger and frustration. Pollexy becomes their own personal, gentle and never-flustered friend to coach to them along the way, giving them confidence, respect and the sense of privacy and freedom we all want to enjoy.

-Troy Larson

TAGS:
Ana Visneski

Ana Visneski

Ana Visneski is the Principal Technical PM for AWS Disaster Response. You can follow her on Twitter: @acvisneski