AWS Architecture Blog
Discovering Hot Topics using Machine Learning
Successful businesses not only have great products and services; they also have a deep understanding of their customers. Companies that can use behavioral analytics in marketing automation platforms are better equipped to deliver real-time marketing efforts. According to a research case study from Deloitte, companies with a customer-centric business model are 60% more profitable. Knowing and adopting this differentiating business model is key to becoming a market leader in today’s fiercely competitive landscape. With a user base of 3.8 billion, 86% of whom interact daily, social media has become an impactful customer voice through platforms like Twitter, Facebook, and Reddit. It provides an unfiltered view of customer expectations of your company’s products, services, and policies. Businesses can use this analysis and insight to react quickly to new growth opportunities, identify negative brand associations, and deliver an exceptional customer experience.
In this blog post, we’ll explore a pre-packaged solution, built and vetted by Amazon Web Services (AWS), called Discovering Hot Topics Using Machine Learning. This is an AWS Solutions implementation that ingests social media feeds to identify the most dominant topics associated with your products, services, events, and brands. The solution automates text and image ingestion from social media to provide near real-time inferences using machine learning (ML) algorithms. It uses Amazon Comprehend, Amazon Translate, and Amazon Rekognition.
Use Case
Here’s a fictional scenario to showcase the value of our solution. A media company launched a new TV series on controversial issues such as healthcare, global warming, data privacy, and AI. This media company deployed the Discovering Hot Topics Using Machine Learning solution to analyze the success of the series. The solution has a pre-built dashboard that helps the company understand its viewers’ opinions and make data-driven decisions about upcoming content.
Analyzing Text
 
 
Figure 1. Example Amazon QuickSight dashboard for Topic Analysis
Social media conversations are impactful customer voices, but deriving signal from noise is a significant challenge given the volume of data. To address this, our solution uses topic modeling to identify groups of phrases that together make up a conversation. Figure 1, widget 1, lists the dominant conversations identified each day from the ingested feed. Selecting a topic from that list (for example, ‘000’) filters the data on widgets 5, 6, 7, and 8 for the selected topic. Widget 5 is a word cloud of phrases corresponding to the selected topic.
The Heat Map (Figure 1, widget 7) provides another dimension to these topics: it shows the intensity of topics over time. Analysts can investigate spikes in intensity, like the one in the top left cell of widget 7. Selecting that cell in the chart also filters widgets 1, 5, 6, and 8 to get to the root of the conversations. Widget 6 shows the sentiment associated with the selected topic. Selecting a specific sentiment (for example, ‘Negative’) provides a closer look at related conversations for the selected topic, as well as insights on the voice of the viewer.
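To make the heat map idea concrete, here is a minimal sketch of how topic intensity over time could be computed from inference records. It is not the solution's implementation: the record fields `topic` and `created_at`, the function names, and the sample data are all hypothetical, and the dashboard itself does this aggregation inside QuickSight.

```python
from collections import Counter
from datetime import datetime


def topic_intensity(records):
    """Count mentions per (topic, day); each count is one heat map cell."""
    counts = Counter()
    for rec in records:
        # Hypothetical field names; timestamps are assumed to be ISO 8601.
        day = datetime.fromisoformat(rec["created_at"]).date().isoformat()
        counts[(rec["topic"], day)] += 1
    return counts


def hottest_cell(counts):
    """Return the ((topic, day), count) pair with the highest count."""
    return max(counts.items(), key=lambda kv: kv[1])


# Illustrative sample data, not real output from the solution.
sample = [
    {"topic": "000", "created_at": "2021-03-01T09:15:00"},
    {"topic": "000", "created_at": "2021-03-01T17:40:00"},
    {"topic": "001", "created_at": "2021-03-01T11:05:00"},
    {"topic": "000", "created_at": "2021-03-02T08:00:00"},
]
cells = topic_intensity(sample)
print(hottest_cell(cells))
```

A spike then shows up as a cell whose count is well above its neighbors, which is the kind of cell an analyst would select to filter the other widgets.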
To analyze trend reversals and spikes, and to gain insights on what ‘works well’ and what ‘does not work,’ this media company can use several widgets. Figure 1, widget 2 (overall sentiment), widget 3 (near-term sentiment), and widget 4 (long-term sentiment) will help the company uncover new opportunities. The near real-time nature of the insights allows the company to quickly remediate any negative fallout.
 
 
Figure 2. Example Amazon QuickSight dashboard for Text Analysis
On the Text Analysis tab (Figure 2), widgets 1 and 2 aggregate entities and key phrases, respectively, along with the sentiment of the tweets in which they occur. Blending sentiment with entities and phrases provides insights on atypical correlations. Clicking on the colored grouping of sentiments in widget 1 or 2 filters the table of tweets in widget 4, which helps identify atypical scenarios. Widget 4 shows tweet attributes (likes, retweets, and quote tweets), and clicking on an individual row opens the tweet on Twitter.com for real-time attribute values.
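The blending of sentiment with entities described above can be sketched as a small aggregation. This is an illustration under stated assumptions, not the dashboard's logic: the tweet fields `id`, `sentiment`, and `entities`, the function names, and the sample tweets are hypothetical. The sentiment labels follow the values Amazon Comprehend returns (POSITIVE, NEGATIVE, NEUTRAL, MIXED).

```python
from collections import Counter, defaultdict


def blend_entities_with_sentiment(tweets):
    """Build an entity -> sentiment -> tweet-count table."""
    table = defaultdict(Counter)
    for tweet in tweets:
        # Use a set so an entity repeated in one tweet is counted once.
        for entity in set(tweet["entities"]):
            table[entity][tweet["sentiment"]] += 1
    return table


def tweets_for(tweets, entity, sentiment):
    """Mimic clicking a sentiment grouping: return the matching tweet ids."""
    return [
        t["id"]
        for t in tweets
        if entity in t["entities"] and t["sentiment"] == sentiment
    ]


# Illustrative sample data only.
tweets = [
    {"id": 1, "sentiment": "NEGATIVE", "entities": ["data privacy", "season finale"]},
    {"id": 2, "sentiment": "POSITIVE", "entities": ["season finale"]},
    {"id": 3, "sentiment": "NEGATIVE", "entities": ["data privacy"]},
]
table = blend_entities_with_sentiment(tweets)
print(dict(table["data privacy"]))
print(tweets_for(tweets, "data privacy", "NEGATIVE"))
```

An entity whose counts skew toward one sentiment while the overall feed skews the other way is the kind of atypical correlation worth inspecting in the tweet table.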
Figure 2, widget 3, can filter data to specific geographies (when GeoCoordinates are available). With this, the global media company can add spatial analysis to understand how the new series is resonating with viewers in different geographies.
Analyzing Images
 
 
Figure 3. Example Amazon QuickSight dashboard for Image Analysis
A picture is worth a thousand words. So in addition to text, the solution adopts a multi-model approach to analyze memes, images, and animated GIFs for embedded text and inappropriate or offensive imagery.
The embedded text undergoes Natural Language Processing (NLP) to detect entities (Figure 3, widget 1) and phrases (Figure 3, widget 2). Analyzing these entities and phrases gives this media company even deeper insights on customer sentiments.
Using widgets 4 and 5, media companies can identify moderation labels and proactively remediate unwanted associations.
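As a rough sketch of the image analysis described above, the function below extracts embedded text and moderation labels from one image. It assumes a client that exposes the Amazon Rekognition `detect_text` and `detect_moderation_labels` operations, as the boto3 Rekognition client does. The function name, the `min_confidence` default, and the stub class are hypothetical; the stub only stands in for the service so the example runs without AWS credentials, and its responses are made up.

```python
def analyze_image(rekognition, image_bytes, min_confidence=60.0):
    """Return embedded text and moderation labels for one image."""
    image = {"Bytes": image_bytes}

    # Keep whole lines of detected text; the response also lists single words.
    text_resp = rekognition.detect_text(Image=image)
    lines = [
        d["DetectedText"]
        for d in text_resp["TextDetections"]
        if d["Type"] == "LINE"
    ]

    mod_resp = rekognition.detect_moderation_labels(
        Image=image, MinConfidence=min_confidence
    )
    labels = [(m["Name"], m["Confidence"]) for m in mod_resp["ModerationLabels"]]

    return {"embedded_text": " ".join(lines), "moderation_labels": labels}


class StubRekognition:
    """Hypothetical stand-in that returns canned responses for a dry run."""

    def detect_text(self, Image):
        return {
            "TextDetections": [
                {"DetectedText": "SEASON TWO", "Type": "LINE"},
                {"DetectedText": "SEASON", "Type": "WORD"},
                {"DetectedText": "TWO", "Type": "WORD"},
            ]
        }

    def detect_moderation_labels(self, Image, MinConfidence):
        return {"ModerationLabels": [{"Name": "Violence", "Confidence": 81.5}]}


image_result = analyze_image(StubRekognition(), b"fake-image-bytes")
print(image_result)
```

With AWS credentials configured, passing `boto3.client("rekognition")` in place of the stub would call the real service. The embedded text could then go through the same NLP step as tweet text.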
Let’s unpack the Discovering Hot Topics Using Machine Learning solution.
Solution Highlights
The solution uses AWS CloudFormation to automate deployment of the solution architecture into the AWS Cloud, as shown in Figure 4.
 
 
Figure 4. Discovering Hot Topics using Machine Learning solution architecture
This architecture can be segmented into three phases:
- Ingestion: Social media serves as the feed ingestion source. Ingestion is automated using AWS Lambda and Amazon CloudWatch Events, and buffered through Amazon Kinesis Data Streams.
- Inference: A workflow engine using AWS Step Functions orchestrates AWS AI services. Amazon Translate and Amazon Comprehend translate and analyze ingested data. Amazon Rekognition detects moderation labels on images in the tweets. The resulting inference data is stored in Amazon Simple Storage Service (Amazon S3) using Amazon Kinesis Data Firehose.
- Visualization: For visualization, deployment comes with a pre-built Amazon QuickSight dashboard. This dashboard renders inference data across topic, text, image, and geography analysis tabs. Example use cases for QuickSight dashboards can be found in our Implementation Guide.
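The text side of the inference phase can be sketched as a single function that translates a post and then runs sentiment, entity, and key phrase detection. This is a simplified illustration, not the solution's workflow: the solution orchestrates these services through AWS Step Functions, while this sketch calls them sequentially. The clients are assumed to expose the Amazon Translate `translate_text` and Amazon Comprehend `detect_sentiment`, `detect_entities`, and `detect_key_phrases` operations, as the boto3 clients do. The function name and stub classes are hypothetical, and the stubs return made-up responses so the example runs without AWS credentials.

```python
def infer_text(translate, comprehend, text, target_lang="en"):
    """Translate a post, then extract sentiment, entities, and key phrases."""
    translated = translate.translate_text(
        Text=text,
        SourceLanguageCode="auto",  # ask the service to detect the source language
        TargetLanguageCode=target_lang,
    )["TranslatedText"]

    sentiment = comprehend.detect_sentiment(Text=translated, LanguageCode=target_lang)
    entities = comprehend.detect_entities(Text=translated, LanguageCode=target_lang)
    phrases = comprehend.detect_key_phrases(Text=translated, LanguageCode=target_lang)

    return {
        "text": translated,
        "sentiment": sentiment["Sentiment"],
        "entities": [e["Text"] for e in entities["Entities"]],
        "key_phrases": [p["Text"] for p in phrases["KeyPhrases"]],
    }


class StubTranslate:
    """Hypothetical stand-in: returns the input unchanged."""

    def translate_text(self, Text, SourceLanguageCode, TargetLanguageCode):
        return {"TranslatedText": Text}


class StubComprehend:
    """Hypothetical stand-in with canned responses."""

    def detect_sentiment(self, Text, LanguageCode):
        return {"Sentiment": "POSITIVE"}

    def detect_entities(self, Text, LanguageCode):
        return {"Entities": [{"Text": "pilot episode", "Type": "OTHER"}]}

    def detect_key_phrases(self, Text, LanguageCode):
        return {"KeyPhrases": [{"Text": "the pilot episode"}]}


text_result = infer_text(StubTranslate(), StubComprehend(), "Loved the pilot episode")
print(text_result)
```

With credentials configured, `boto3.client("translate")` and `boto3.client("comprehend")` could replace the stubs. Each returned dictionary corresponds loosely to one inference record that would be delivered to Amazon S3.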
Although Twitter is the solution’s default ingestion source, customers can use other API-based platform feeds, such as YouTube comments, Facebook, and Instagram, or even internal enterprise datastores as the source. The inference phase stores inference data in Amazon S3, so customers can enrich their BI dashboards to inspect high-quality, filtered social media signals.
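For a source other than Twitter, one possible approach is to normalize each item into a common record and write it to the ingestion stream. The sketch below assumes a client exposing the Amazon Kinesis Data Streams `put_record` operation, as the boto3 Kinesis client does. The record schema, the function names, and the stream name `example-feed-stream` are hypothetical; the schema the solution actually expects is described in its Implementation Guide, not here.

```python
import json


def to_feed_record(platform, item_id, text, created_at):
    """Normalize one item from any source into a common (hypothetical) shape."""
    return {
        "platform": platform,
        "id": str(item_id),
        "text": text,
        "created_at": created_at,
    }


def publish(kinesis, stream_name, record):
    """Write one record to the stream, partitioned by the item id."""
    return kinesis.put_record(
        StreamName=stream_name,
        Data=json.dumps(record).encode("utf-8"),
        PartitionKey=record["id"],
    )


class StubKinesis:
    """Hypothetical stand-in that records calls instead of contacting AWS."""

    def __init__(self):
        self.sent = []

    def put_record(self, StreamName, Data, PartitionKey):
        self.sent.append((StreamName, Data, PartitionKey))
        return {"SequenceNumber": str(len(self.sent))}


stub = StubKinesis()
record = to_feed_record("youtube", 42, "Great episode!", "2021-03-01T09:15:00Z")
publish(stub, "example-feed-stream", record)
print(stub.sent[0])
```

Keeping normalization separate from publishing means a new source only needs its own small adapter, and the downstream inference phase sees the same record shape regardless of origin.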
Use Case Implementation
Discovering Hot Topics Using Machine Learning is a ready-to-deploy, one-click-launch solution that businesses can use to listen to the voice of their customers. It has a pre-packaged QuickSight dashboard and is deployed as a CloudFormation template. An updatable QueryParameter on the CloudFormation template customizes the deployment to pull related Twitter feeds (see how to build a search string for the Twitter API).
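As a small illustration of building such a search string, the helper below joins terms with the `OR` operator and quotes multi-word phrases so they match exactly. The function name and the example terms are hypothetical, and the exact format that QueryParameter accepts should be confirmed against the Implementation Guide and the Twitter API documentation.

```python
def build_query(terms):
    """Join search terms with OR, quoting multi-word phrases."""
    parts = []
    for term in terms:
        term = term.strip()
        if not term:
            continue  # skip empty entries
        parts.append(f'"{term}"' if " " in term else term)
    return " OR ".join(parts)


query = build_query(["data privacy", "healthcare", "#globalwarming"])
print(query)  # "data privacy" OR healthcare OR #globalwarming
```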
After successful deployment, the solution starts ingesting tweets via Amazon Kinesis Data Streams. The feed flows into a workflow engine, where the unstructured data is pre-processed and synthesized by AWS AI services for inferences. About ten minutes after deployment, ML inferences (including text and image inferences) begin flowing into your QuickSight dashboard.
Although this blog post presents a Media and Entertainment use case, the solution can bring quick results to any industry segment. It reduces undifferentiated heavy lifting by handling the unstructured nature of social media data, and it addresses the infrastructure scaling and performance concerns that come with ever-increasing volumes of data feeds. The serverless nature of our AWS Solution makes this deployment cost-effective: customers pay only for what they use. Cost estimate details can be found in our Implementation Guide.
Conclusion
In this post, we explored the Discovering Hot Topics Using Machine Learning solution, a readily available option that brings near real-time customer insights and trends from social media feeds to a business dashboard. We saw how a media company can swiftly identify and react to new growth opportunities while delivering higher levels of customer satisfaction.
Visit our AWS Solutions webpage for more details. Check out Solving with AWS Solutions: Discovering Hot Topics Using Machine Learning on YouTube.
