Using machine learning to stay connected
This post describes winning solutions from the AWS Marketplace Machine Learning Challenge hackathon. Other winners created solutions using machine learning to automate tasks and increase personalization and to support healthcare during a pandemic.
The AWS Marketplace Developer Challenge: ML-Powered Solutions hackathon was hosted by the AWS Marketplace Machine Learning (ML) team earlier this year. A hackathon is a design sprint-like event in which software developers collaborate with the goal of creating functioning software. The hackathon had over 1,100 participants collaborating on 57 projects. Several participants submitted projects that use ML models to help people stay connected with their near and dear ones. This blog post highlights two winning solutions that demonstrate the use of ML to connect with friends for karaoke and to improve the online self-learning experience. See the submissions received for the AWS Marketplace Developer Challenge: ML-Powered Solutions hackathon here.
Participants used at least one ML model from AWS Marketplace deployed on Amazon SageMaker and at least one AWS service that is not Amazon SageMaker.
- AWS Marketplace is a curated digital catalog of listings from independent software vendors that enables you to find, test, buy, and deploy software that runs on AWS. AWS Marketplace lists pre-trained ML models and algorithms that can be directly deployed on Amazon SageMaker. For more information on how AWS Marketplace supports ML workloads, see Using AWS Marketplace for machine learning workloads.
- Amazon SageMaker is a fully managed service that provides every developer and data scientist with the ability to build, train, and deploy ML models quickly. Amazon SageMaker removes the heavy lifting from each step of the ML process to make it easier to develop high-quality models.
In this post, we give an overview of two winning ML solutions:
- Smart Teacher – A Chrome browser extension that generates a quiz using Natural Language Processing (NLP) and ML models.
- MusicBucket – A social connectivity app that lets you sing karaoke with your friends and family.
Second prize winner: Smart Teacher
Smart Teacher is a web browser extension that generates a quiz and smart notes on YouTube and other educational technology (EdTech) video websites. It does this using Natural Language Processing (NLP) and ML. It lets users self-evaluate their understanding of the content by presenting them with a quiz based on the video content. The extension also lets the user jump back to the part of the video relevant to a question. This interactive extension helps users learn by testing their knowledge while they watch. The following screenshot shows an educational YouTube video and the Smart Teacher (originally named Smart Quizzer) extension being activated.
The COVID-19 pandemic has impacted education. Millions of students are now learning by watching videos. Although there is no scarcity of videos to help students learn, human attention spans are limited. Watching long lectures can lead to the viewers “zoning out” and having to re-watch the video to understand parts they missed. The Smart Teacher extension helps solve the problem by generating and presenting a comprehension quiz to the viewer.
AWS services used
- Amazon Route 53 is a highly available and scalable cloud Domain Name System (DNS) web service.
- Amazon CloudFront is a content delivery network (CDN) service that securely delivers data, videos, applications, and APIs to customers globally.
- Amazon EC2 presents a true virtual computing environment, allowing you to use web service interfaces to launch instances with a variety of operating systems, load them with your custom application environment, manage your network’s access permissions, and run your image using as many or few systems as you desire.
- Elastic Load Balancing automatically distributes incoming application traffic across multiple targets, such as Amazon EC2 instances, containers, IP addresses, and Lambda functions.
- Amazon Simple Storage Service (Amazon S3) helps you store and protect any amount of data for a range of use cases.
- Amazon Transcribe makes it easy for developers to add speech-to-text capability to their applications.
- Amazon Comprehend is a natural language processing (NLP) service that uses machine learning to find insights and relationships in text. No machine learning experience is required to use it.
- Amazon DynamoDB is a key-value and document database that delivers single-digit millisecond performance at any scale.
Users must first install the Smart Teacher extension in their browser. Once installed, the extension does the following:
- Makes a backend request via Route 53 to fetch the smart quiz in addition to notes for the video.
- The request gets routed to a CloudFront CDN, which returns resources found in its cache in addition to static resources from S3.
- CloudFront also invokes Elastic Load Balancing for serving the dynamic request, that is, the quiz itself.
- Elastic Load Balancing sends the request to the EC2 instance, which is running in an auto-scaled manner.
- The EC2 instance first checks whether a record exists for the requested video ID in the DynamoDB table. If the quiz and smart notes exist from a previous request, it returns them. If the video ID does not exist, it performs the following steps to generate smart notes and a quiz:
  - It uses Amazon Transcribe to convert the audio to text.
  - It invokes the Mphasis DeepInsights Text Summarizer ML model from AWS Marketplace to summarize the text.
  - It then invokes Amazon Comprehend to identify important concepts in the text.
  - It stores the information in DynamoDB for future lookups.
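The lookup-then-generate flow above is a cache-aside pattern, and a minimal Python sketch can make it concrete. This is an illustration under stated assumptions, not the Smart Teacher team's code: the function name, the `video_id` key, and the `transcribe_audio` and `summarize` callables are hypothetical stand-ins for the Amazon Transcribe job and the Mphasis summarizer endpoint. The `table` argument is assumed to behave like a boto3 DynamoDB `Table` resource and `comprehend` like a boto3 Amazon Comprehend client. Using key phrase detection for "important concepts" is also an assumption, and quiz question generation is left out because the post does not describe it.

```python
from typing import Any, Callable, Dict


def get_study_material(
    video_id: str,
    table: Any,                               # assumed: boto3 DynamoDB Table resource
    transcribe_audio: Callable[[str], str],   # hypothetical wrapper around Amazon Transcribe
    summarize: Callable[[str], str],          # hypothetical wrapper around the summarizer endpoint
    comprehend: Any,                          # assumed: boto3 Comprehend client
) -> Dict[str, Any]:
    """Return stored notes for a video, generating and storing them on a miss."""
    # 1. Check whether a record already exists for this video ID.
    cached = table.get_item(Key={"video_id": video_id}).get("Item")
    if cached is not None:
        return cached

    # 2. Miss: convert audio to text, summarize it, then extract key concepts.
    transcript = transcribe_audio(video_id)
    notes = summarize(transcript)
    # Long transcripts may need to be split to fit the service's input size limit.
    phrases = comprehend.detect_key_phrases(Text=transcript, LanguageCode="en")
    concepts = sorted({phrase["Text"] for phrase in phrases["KeyPhrases"]})

    # 3. Store the result so later requests for the same video skip steps 2 and 3.
    item = {"video_id": video_id, "notes": notes, "concepts": concepts}
    table.put_item(Item=item)
    return item
```

Passing the clients in as arguments keeps the sketch independent of credentials and makes the expensive ML calls easy to replace with stubs when testing.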
See the solution demo, architecture diagram, and more information about the Smart Teacher project here.
Honorable mention prize winner: MusicBucket
Submitter: Evan Li
MusicBucket is an ML-powered application that uses serverless managed services on the backend. You can use MusicBucket to challenge your friends, family, or even strangers to participate in a karaoke room or a music sing off.
It’s challenging to safely socialize during a pandemic. To stay safe and slow down the spread of COVID-19, we must stay home, avoid crowds, and practice social distancing. When everyone is staying at home, however, it can get boring. The isolation can be challenging and even cause anxiety. MusicBucket was created to help everyone socialize and enjoy singing with friends and family while maintaining physical distance.
How does MusicBucket work?
MusicBucket is a mobile application that uses the Quantiphi Source Separation model from AWS Marketplace. The ML model separates vocals from background music. With MusicBucket, you can also upload your own song. You can create a room in the app, invite friends or family, and sing along karaoke-style without the vocals. Music students, teachers, or anyone who wants to improve their singing skills can also use it for solo practice. If you’re shy about singing karaoke, you can enter a private or solo room and listen to the isolated song vocals.
To use MusicBucket for a karaoke social, invite your friends to a karaoke room. They can select a song or upload their own song and start singing. Users take turns singing songs with the background music. The following screenshots show a room with four users and the song selection screen.
AWS services used
- Amazon Cognito is a service that lets you add user sign up, sign in, and access control to your web and mobile apps.
- Amazon API Gateway is a fully managed service that lets developers create, publish, maintain, monitor, and secure APIs at any scale.
- Amazon Relational Database Service (Amazon RDS) makes it easy to set up, operate, and scale a relational database in the cloud.
- Amazon DynamoDB is a fully managed key-value and document database that delivers single-digit millisecond performance at any scale.
- AWS Lambda is a serverless compute service that runs your code in response to events and automatically manages the underlying compute resources for you.
- Amazon Transcribe enables speech-to-text conversion using Automatic Speech Recognition (ASR) technology.
- Amazon Simple Notification Service (Amazon SNS) is a highly available, durable, secure, fully managed pub/sub messaging service that enables you to decouple microservices, distributed systems, and serverless applications.
- Amazon Simple Storage Service (Amazon S3) offers a durable, highly available, and scalable data storage infrastructure at low costs.
- Source Separation, a pre-trained ML model available in AWS Marketplace that separates song vocals from the background music.
- Barcode/QR-code Scanner, available in AWS Marketplace, scans images and returns barcode data.
- Users authenticate to the application with Amazon Cognito.
- The application uses Twilio WebRTC to create the karaoke room, backed by voice calling functionality.
- The authenticated user invites friends by making a request through API Gateway (REST API) to a Lambda function.
- The Lambda function generates tokens, which are sent to the invited users in a notification that they can click to join the room.
- Amazon DynamoDB is used to store session metadata.
Refer to the following diagram.
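To illustrate the invitation flow above, here is a hedged Python sketch of what the Lambda function behind the REST API might look like. Everything specific in it is an assumption made for illustration: the handler name, the request body fields (`room_id`, `invitees`), the session item layout, and the `notify` callable that stands in for whichever notification channel the app uses. A random token from `secrets` is also a placeholder, since a real room would likely need access tokens issued through Twilio. The `sessions_table` argument is assumed to behave like a boto3 DynamoDB `Table` resource, and the caller's identity is read from the claims that a Cognito user pool authorizer adds to a proxy-integration event.

```python
import json
import secrets
import time
from typing import Any, Callable, Dict


def invite_handler(
    event: Dict[str, Any],
    sessions_table: Any,                             # assumed: boto3 DynamoDB Table resource
    notify: Callable[[str, Dict[str, str]], None],   # hypothetical notification sender
    now: Callable[[], float] = time.time,
) -> Dict[str, Any]:
    """Create one join token per invitee, record it, and notify the invitee."""
    # Identity of the authenticated caller, as supplied by the Cognito authorizer.
    host_id = event["requestContext"]["authorizer"]["claims"]["sub"]
    body = json.loads(event["body"])
    room_id = body["room_id"]

    invited = []
    for user_id in body["invitees"]:
        token = secrets.token_urlsafe(32)  # placeholder for a real room access token
        # Session metadata goes to DynamoDB (hypothetical item layout).
        sessions_table.put_item(
            Item={
                "token": token,
                "room_id": room_id,
                "host_id": host_id,
                "user_id": user_id,
                "created_at": int(now()),
            }
        )
        # The invitee receives the token in a notification they can open to join.
        notify(user_id, {"room_id": room_id, "token": token})
        invited.append(user_id)

    return {
        "statusCode": 200,
        "body": json.dumps({"room_id": room_id, "invited": invited}),
    }
```

In a deployed function, the table and notifier would be created once at module load and bound to the handler, for example with `functools.partial`, so that Lambda can call it with just the event and context.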
- Authenticated users upload an audio file to the Amazon S3 bucket.
- A Lambda function is automatically triggered to perform the following actions:
  - It invokes the Amazon SageMaker endpoint that serves the Quantiphi – Source Separation model.
  - It stores links to the returned vocals and background tracks in Amazon RDS.
Refer to the following diagram.
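The upload-triggered step above could be sketched in Python roughly as follows. Treat it as an illustration only: the endpoint name, the `audio/wav` content type, the output key layout, and above all the response format (a JSON object with base64-encoded `vocals` and `background` fields) are assumptions, because the post does not document the model's actual input and output contract. The `save_links` callable is a hypothetical stand-in for the SQL insert into Amazon RDS. The `s3` and `sagemaker_runtime` arguments are assumed to behave like the boto3 `s3` and `sagemaker-runtime` clients.

```python
import base64
import json
from typing import Any, Callable, Dict
from urllib.parse import unquote_plus

ENDPOINT_NAME = "source-separation-endpoint"  # hypothetical endpoint name


def on_song_uploaded(
    event: Dict[str, Any],
    s3: Any,                                      # assumed: boto3 S3 client
    sagemaker_runtime: Any,                       # assumed: boto3 sagemaker-runtime client
    save_links: Callable[[str, str, str], None],  # hypothetical RDS writer
) -> Dict[str, str]:
    """Separate an uploaded song into stems and record where each stem is stored."""
    # S3 event notifications URL-encode object keys, so decode before use.
    record = event["Records"][0]["s3"]
    bucket = record["bucket"]["name"]
    key = unquote_plus(record["object"]["key"])

    audio = s3.get_object(Bucket=bucket, Key=key)["Body"].read()

    response = sagemaker_runtime.invoke_endpoint(
        EndpointName=ENDPOINT_NAME,
        ContentType="audio/wav",  # assumption: the model accepts raw WAV bytes
        Body=audio,
    )
    # Assumed response shape: {"vocals": <base64>, "background": <base64>}.
    stems = json.loads(response["Body"].read())

    links = {}
    for stem in ("vocals", "background"):
        # Assumes the upload trigger is limited to another prefix, so these
        # writes do not invoke the function again.
        out_key = f"separated/{key}/{stem}.wav"
        s3.put_object(Bucket=bucket, Key=out_key, Body=base64.b64decode(stems[stem]))
        links[stem] = f"s3://{bucket}/{out_key}"

    save_links(key, links["vocals"], links["background"])
    return links
```

Storing only the links in the relational database, rather than the audio itself, keeps rows small and leaves the large files in Amazon S3.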
- A Lambda function then retrieves the WebSocket API connection IDs from Amazon DynamoDB. It returns a JSON payload through the WebSocket API connection in API Gateway back to the user.
- Lyrics are created by integrating Amazon SageMaker with Amazon Transcribe. The vocal track returned by the Quantiphi – Source Separation model is forwarded to Amazon Transcribe, and the resulting song lyrics appear in the app for easy sing-along. Refer to the following diagram.
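The last two steps can be sketched in Python as two small functions: one that starts a transcription job on the isolated vocal track, and one that pushes a JSON payload to every WebSocket connection in a room. This is a sketch under assumptions rather than MusicBucket's implementation. The table name, the `room_id` and `connection_id` attribute names, the WAV media format, and the `en-US` language code are all hypothetical. The `transcribe`, `dynamodb`, and `api` arguments are assumed to behave like the boto3 `transcribe`, `dynamodb`, and `apigatewaymanagementapi` clients, the last one created with the WebSocket stage's endpoint URL.

```python
import json
from typing import Any, Dict, List


def start_lyrics_job(transcribe: Any, job_name: str, vocals_uri: str) -> str:
    """Start an asynchronous Amazon Transcribe job on the isolated vocal track."""
    transcribe.start_transcription_job(
        TranscriptionJobName=job_name,
        Media={"MediaFileUri": vocals_uri},
        MediaFormat="wav",     # assumption: the vocal stem is stored as WAV
        LanguageCode="en-US",  # assumption: English-language songs
    )
    return job_name


def broadcast_to_room(
    dynamodb: Any,   # assumed: boto3 DynamoDB client (low-level API)
    api: Any,        # assumed: boto3 apigatewaymanagementapi client
    table_name: str,
    room_id: str,
    payload: Dict[str, Any],
) -> List[str]:
    """Send a JSON payload to every WebSocket connection recorded for a room."""
    # Assumes room_id is the table's partition key. Results beyond the first
    # page are ignored to keep the sketch short.
    result = dynamodb.query(
        TableName=table_name,
        KeyConditionExpression="room_id = :room",
        ExpressionAttributeValues={":room": {"S": room_id}},
    )
    data = json.dumps(payload).encode("utf-8")

    sent = []
    for item in result["Items"]:
        connection_id = item["connection_id"]["S"]
        api.post_to_connection(ConnectionId=connection_id, Data=data)
        sent.append(connection_id)
    return sent
```

Because transcription runs asynchronously, the app would need a follow-up step, such as polling the job or reacting to its completion event, before the lyrics can be broadcast. A production version would also remove connection IDs that API Gateway reports as gone.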
In this post, we provided an overview of two applications built as part of the AWS Marketplace Developer Challenge hackathon. Solutions like these use machine learning to help you stay connected with friends and family, even during a pandemic.
Participants in the AWS Marketplace Developer Challenge accepted the challenge to develop solutions using AWS services and AWS Marketplace ML models to help solve real-world problems. Congratulations to these winners, in addition to all who submitted.
To find out more about these solutions and what’s next for the submitters, visit their hackathon submission pages. You can explore AWS Marketplace machine learning models here, and find out more about how to deploy them on Amazon SageMaker.
About the authors
Pranusha Manchala is a Solutions Architect at AWS who works with education companies. She has worked with many EdTech customers and provided them with architectural guidance for building highly scalable and cost-optimized applications on AWS. She developed an interest in machine learning and started to dive deep into this technology. She enjoys cooking, baking, and outdoor activities in her free time.
Kanchan Waikar is a Senior Partner Solutions Architect at Amazon Web Services with the AWS Marketplace for machine learning group. She has over 13 years of experience building, architecting, and managing NLP and software development projects. She has a master’s degree in computer science (data science major), and she enjoys helping customers build solutions backed by AI/ML-based AWS services and partner solutions.