Doorman

Inspiration

Our company has a lot of people working remotely and in offices at multiple locations. It can be very hard to find out whether someone is active or available, or where they physically are, and people sometimes forget to set their status to active. We also deal with multiple regions and time zones.

We also use the Slack personal status message to show where we are (working from home or in the office, and if so, which location).

In comes the DeepLens. Wouldn't it be awesome if, whenever you come into the office, you got recognized and marked as 'active' in Slack, and your status was set to whichever office you are in?

What it does

  • It detects whenever a person comes into one of the offices
  • It tries to identify the person
  • If it identifies the person, it posts a welcome message on Slack: "Welcome @svdgraaf!"
  • If it doesn't identify the person, it asks on Slack "Who is this?", and you can select which user it is
  • If you don't want this person to be recognized (e.g. clients or non-Slack users), you can select 'ignore', and it will remove the image from S3
  • It then stores this information in the Rekognition API.
    Next time, it identifies the person as such and notes it in the Slack channel

What it doesn't do yet:

  • Mark the person as active on Slack
  • Set the person's status on Slack
  • Speak out loud with Polly (I haven't gotten it to work on the device yet)

Created By: Sander van de Graaf

How I built it

I built this using the Serverless Framework, which makes deploying the Lambda functions really easy. I tried to build my own model, but failed (and I was concerned about the cost of the EC2 instances for training). In the end, I used the stock 'deeplens-object-detection' model rather than a custom one, and I check its output for 'person' objects.
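As a rough illustration of the "check for 'person' objects" step, here is a minimal sketch. It assumes each parsed detection is a dict with `label`, `prob`, and pixel coordinates `xmin`/`ymin`/`xmax`/`ymax`, and that 'person' has label index 15; both the shape and the index are assumptions for illustration, so check them against the label map of the model you deploy. The helper name `person_boxes` is hypothetical.

```python
PERSON_LABEL = 15     # assumption: index of 'person' in the model's label map
MIN_CONFIDENCE = 0.5  # assumption: tune this for your camera position


def person_boxes(detections, width, height,
                 label=PERSON_LABEL, threshold=MIN_CONFIDENCE):
    """Return (xmin, ymin, xmax, ymax) boxes for confident 'person' detections.

    Boxes are clamped to the frame so they can be used directly as crop bounds.
    """
    boxes = []
    for det in detections:
        if det["label"] != label or det["prob"] < threshold:
            continue
        xmin = max(0, int(det["xmin"]))
        ymin = max(0, int(det["ymin"]))
        xmax = min(width, int(det["xmax"]))
        ymax = min(height, int(det["ymax"]))
        # Skip degenerate boxes that end up empty after clamping.
        if xmax > xmin and ymax > ymin:
            boxes.append((xmin, ymin, xmax, ymax))
    return boxes
```

With a NumPy frame, each box can then be cropped with `frame[ymin:ymax, xmin:xmax]` before it is encoded and uploaded.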

How it works: The DeepLens detects objects passing by. If it detects a person, it takes a snapshot of that part of the frame and uploads it to S3. This triggers a 'guess' function, which tries to identify the person. If it's a known person, it calls the Slack API. If it's an unknown person, it also calls Slack. Within the channel, people can assign the unknown person to a specific user, which calls API Gateway to train on the image.
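The 'guess' and 'train' steps could look roughly like the sketch below. It is not the project's actual code: the function names, the collection name `doorman`, and the 80% threshold are assumptions. The `rekognition` argument is expected to be a `boto3.client("rekognition")`, and the two calls used (`search_faces_by_image` and `index_faces`) are regular Rekognition API operations.

```python
def guess_person(rekognition, bucket, key,
                 collection_id="doorman", threshold=80.0):
    """Return the ExternalImageId of the best face match, or None if unknown.

    Note: Rekognition raises an error when the image contains no face at all,
    so a real handler would want to catch that case as well.
    """
    response = rekognition.search_faces_by_image(
        CollectionId=collection_id,
        Image={"S3Object": {"Bucket": bucket, "Name": key}},
        FaceMatchThreshold=threshold,
        MaxFaces=1,
    )
    matches = response.get("FaceMatches", [])
    if not matches:
        return None
    return matches[0]["Face"].get("ExternalImageId")


def train_person(rekognition, bucket, key, user_id, collection_id="doorman"):
    """Index the snapshot under a Slack user id so it is recognized next time."""
    rekognition.index_faces(
        CollectionId=collection_id,
        Image={"S3Object": {"Bucket": bucket, "Name": key}},
        ExternalImageId=user_id,
    )
```

Storing the Slack user id as the `ExternalImageId` means a later match can be turned into an @-mention without a separate lookup table.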

The architecture diagram can be found in the image carousel above.

Challenges

Greengrass: A couple of times Greengrass failed on me, so it was not deploying or running my Lambda function.

DeepLens: My DeepLens was working one day, and it wasn't the next. The Ubuntu auto-update feature is neat, until it breaks :) Also: it took me a while to figure out that you can run multiple Lambda functions at the same time on the DeepLens. Mind blown!
 
Logging: It's really hard to debug your Lambda functions on the device when they don't work. I ended up finding a way to push the logs into CloudWatch, which is neat, but it was not very well documented.

Privacy issues (setup timing and control): A coworker asked whether they could be excluded from the test. I understand the implications. Fortunately, this was just a POC and for fun, but it did spark a good discussion about privacy concerns with these kinds of technologies. We ended up setting a timer so that detection only runs between 08:00 and 10:00.
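The time window can be enforced with a small guard before any frame is processed. This is a minimal sketch that assumes the device clock is set to the office's local time; the function name is hypothetical.

```python
from datetime import datetime, time

WINDOW_START = time(8, 0)
WINDOW_END = time(10, 0)


def detection_enabled(now, start=WINDOW_START, end=WINDOW_END):
    """Return True while `now` (a local datetime) falls in [start, end)."""
    return start <= now.time() < end
```

In the inference loop this would be called as `detection_enabled(datetime.now())`, skipping the frame whenever it returns False.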

Slack API (timeout is very short): I learned the hard way that you have to reply to Slack very quickly (its documented limit for interactive requests is 3 seconds), or it does not work. I ended up handing the slow work off to a second, asynchronous call to get everything working correctly.
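One way to stay inside Slack's timeout is to acknowledge the request immediately and hand the slow work to a second Lambda function that is invoked asynchronously. The sketch below assumes an API Gateway proxy event and a worker function named `doorman-train`; both names are hypothetical. The `lambda_client` argument is expected to be a `boto3.client("lambda")`.

```python
import json


def handle_slack_action(event, lambda_client, worker_name="doorman-train"):
    """Acknowledge Slack right away and pass the payload to a worker Lambda."""
    lambda_client.invoke(
        FunctionName=worker_name,
        # "Event" queues the invocation and returns without waiting for it.
        InvocationType="Event",
        Payload=json.dumps({"body": event.get("body", "")}).encode("utf-8"),
    )
    # An empty 200 response is enough for Slack to treat the action as received.
    return {"statusCode": 200, "body": ""}
```

The worker can then take as long as it needs and post its result back to Slack separately.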

Polly: I have the MP3 generation working, and I can play the files locally just fine. I haven't figured out yet how to play them on the DeepLens itself.

Accomplishments that I'm proud of

It works! It's really cool to see all the Lambda functions trigger each other and do their thing, without running any servers. Really awesome stuff!

What I learned

  • Greengrass
  • Slack API
  • OpenCV
  • Polly

What's next

Make S3 upload async: Currently the S3 upload is blocking, which in itself is fine, as it prevents uploading a gazillion images of the same person. But it does lock up the stream momentarily. I envision just writing the files to a local (tmp) folder somewhere and having another Lambda function upload those, or even better: detecting faces on the device directly.
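An alternative to the tmp-folder idea, within a single function, is a background thread fed by a bounded queue, so the inference loop never waits on the network. This is only a sketch under assumptions: it targets Python 3, the helper names are hypothetical, and `upload` stands in for whatever actually sends the snapshot to S3.

```python
import queue
import threading


def start_uploader(upload, maxsize=32):
    """Start a daemon thread that calls upload(item) for every queued item."""
    pending = queue.Queue(maxsize=maxsize)

    def worker():
        while True:
            item = pending.get()
            try:
                upload(item)
            except Exception as exc:  # keep the worker alive after a failed upload
                print("upload failed: %s" % exc)
            finally:
                pending.task_done()

    thread = threading.Thread(target=worker)
    thread.daemon = True
    thread.start()
    return pending


def enqueue_snapshot(pending, item):
    """Queue a snapshot without blocking; drop it if the queue is full."""
    try:
        pending.put_nowait(item)
        return True
    except queue.Full:
        return False
```

Here `upload` could be a small function that calls `put_object` on a `boto3.client("s3")`. Dropping snapshots when the queue is full keeps the video stream responsive at the cost of occasionally missing a frame.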

Don't trigger multiple times: Whenever a person walks past the camera slowly, or walks back within a minute, you get multiple postings to Slack. I want to add a feature that stores the last datetime a person was detected in DynamoDB, and skips the Slack post whenever we see that person again within X time.
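The cooldown could be done with a single conditional write, so that checking and updating the timestamp happen atomically. The sketch below is an assumption about how this might look, not existing project code: the table layout (`user_id` as the key, `last_seen` as epoch seconds), the function name, and the 5-minute default are all hypothetical. The `dynamodb` argument is expected to be a `boto3.client("dynamodb")`.

```python
def should_post(dynamodb, table, user_id, now, cooldown_seconds=300):
    """Record a sighting and return True only if the cooldown has expired.

    `now` is the current time in whole epoch seconds.
    """
    try:
        dynamodb.put_item(
            TableName=table,
            Item={"user_id": {"S": user_id}, "last_seen": {"N": str(now)}},
            # Write only for a new person or one last posted before the cutoff.
            ConditionExpression=(
                "attribute_not_exists(user_id) OR last_seen < :cutoff"
            ),
            ExpressionAttributeValues={
                ":cutoff": {"N": str(now - cooldown_seconds)}
            },
        )
        return True
    except dynamodb.exceptions.ConditionalCheckFailedException:
        return False
```

Because a rejected write leaves the stored timestamp untouched, the cooldown is measured from the last Slack post rather than from the last sighting.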

Do face detection on device: It would be really nice to make this work directly on the device, if I have the time.

Make Slack app public: Currently you need to set up your own "dev" Slack app to get the integration to work. It would be nice if I could publish my app, so that I can just point people to it.

Add Amazon Polly: I have been working with the Polly API and have the MP3 generation working, but I haven't yet found a way to play the MP3 on the DeepLens. I have some speakers available ;)
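For reference, the MP3 generation step can be as small as the sketch below. The function name, the greeting text, and the `Joanna` voice are assumptions for illustration. The `polly` argument is expected to be a `boto3.client("polly")`. Playback on the device is deliberately left out, since that part is still unsolved.

```python
def synthesize_welcome(polly, name, path, voice_id="Joanna"):
    """Ask Polly for a spoken welcome message and save it as an MP3 file."""
    response = polly.synthesize_speech(
        Text="Welcome, %s!" % name,
        OutputFormat="mp3",
        VoiceId=voice_id,
    )
    # AudioStream is a file-like object containing the encoded audio.
    with open(path, "wb") as handle:
        handle.write(response["AudioStream"].read())
    return path
```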

Mark a person as 'active': Currently, the Slack API does not allow marking a person as active without explicit permission from that user. Since I didn't want to intrude on my fellow co-workers, this hasn't been added yet.

Mark a person's status: Same as above, the end users need to approve this individually. It would be nice if we could show the current office location in their status.

Built with

python
deeplens
rekognition
serverless
polly
