AWS Machine Learning Blog
How Searchmetrics uses Amazon SageMaker to automatically find relevant keywords and make their human analysts 20% faster
Searchmetrics is a global provider of search data, software, and consulting solutions, helping customers turn search data into unique business insights. To date, Searchmetrics has helped more than 1,000 companies such as McKinsey & Company, Lowe’s, and AXA find an advantage in the hyper-competitive search landscape.
In 2021, Searchmetrics turned to AWS for help using artificial intelligence (AI) to further improve their search insights capabilities.
In this post, we share how Searchmetrics built an AI solution that increased the efficiency of its human workforce by 20% by automatically finding relevant search keywords for any given topic, using Amazon SageMaker and its native integration with Hugging Face.
Using AI to identify relevance from a list of keywords
A key part of Searchmetrics’ insights offering is its ability to identify the most relevant search keywords for a given topic or search intent.
To do this, Searchmetrics has a team of analysts assessing the potential relevance of certain keywords given a specific seed word. Analysts use an internal tool to review a keyword within a given topic and a generated list of potentially related keywords, and they must then select one or more related keywords that are relevant to that topic.
This manual filtering and selection process was time-consuming and slowed down Searchmetrics’ ability to deliver insights to its customers.
To improve this process, Searchmetrics sought to build an AI solution that could use natural language processing (NLP) to understand the intent of a given search topic and automatically rank an unseen list of potential keywords by relevance.
Using SageMaker and Hugging Face to quickly build advanced NLP capabilities
To solve this, Searchmetrics’ engineering team turned to SageMaker, an end-to-end machine learning (ML) platform that helps developers and data scientists quickly and easily build, train, and deploy ML models.
SageMaker accelerates the deployment of ML workloads by simplifying the ML build process. It provides a broad set of ML capabilities on top of a fully managed infrastructure. This removes the undifferentiated heavy lifting that too often hinders ML development.
Searchmetrics chose SageMaker because of the full range of capabilities it provided at every step of the ML development process:
- SageMaker notebooks enabled the Searchmetrics team to quickly spin up fully managed ML development environments, perform data preprocessing, and experiment with different approaches
- The batch transform capabilities in SageMaker enabled Searchmetrics to efficiently process its inference payloads in bulk, as well as easily integrate into its existing web service in production
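As a rough illustration of the bulk-inference step, the following sketch prepares a JSON Lines payload and hands it to a SageMaker batch transform job. The post does not describe Searchmetrics' actual payload format or job settings, so the record shape (`topic`, `keywords`), the instance type, and the S3 URIs are assumptions chosen for illustration only.

```python
import json
from typing import Dict, Iterable


def build_jsonl_payload(records: Iterable[Dict]) -> str:
    """Serialize one request per line (JSON Lines) so the job can split on newlines."""
    return "".join(json.dumps(record, ensure_ascii=False) + "\n" for record in records)


def start_batch_transform(model_name: str, input_s3_uri: str, output_s3_uri: str):
    """Run a batch transform job for a model that is already registered in SageMaker.

    Requires the `sagemaker` SDK and AWS credentials; the instance type is an assumption.
    """
    from sagemaker.transformer import Transformer  # imported lazily on purpose

    transformer = Transformer(
        model_name=model_name,
        instance_count=1,
        instance_type="ml.m5.xlarge",  # assumption, not from the post
        strategy="SingleRecord",       # send one JSON line per request
        assemble_with="Line",          # write one result per line
        output_path=output_s3_uri,
    )
    transformer.transform(
        data=input_s3_uri,
        content_type="application/json",
        split_type="Line",
    )
    return transformer


# Hypothetical records: one seed topic plus its candidate keywords per line.
example_records = [
    {"topic": "electric cars", "keywords": ["ev charging", "banana bread"]},
    {"topic": "home baking", "keywords": ["sourdough starter", "tesla range"]},
]
payload = build_jsonl_payload(example_records)
```

The payload string would be uploaded to the input S3 location before calling `start_batch_transform`; results are written under the output path, one line per input record.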
Searchmetrics was also particularly interested in the native integration of SageMaker with Hugging Face, an exciting NLP startup that provides easy access to more than 7,000 pre-trained language models through its popular Transformers library.
SageMaker provides a direct integration with Hugging Face through a dedicated Hugging Face estimator in the SageMaker SDK. This makes it easy to run Hugging Face models on the fully managed SageMaker infrastructure.
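A minimal sketch of what using the Hugging Face estimator can look like follows. The post does not show how Searchmetrics configured it, so the script name, source directory, instance type, hyperparameters, and framework versions below are all assumptions; the version combination in particular should be checked against the currently available Hugging Face DLC images.

```python
from typing import Any, Dict


def build_estimator_config(role_arn: str, model_id: str) -> Dict[str, Any]:
    """Collect keyword arguments for the Hugging Face estimator (all values illustrative)."""
    return {
        "entry_point": "train.py",         # hypothetical training script
        "source_dir": "./scripts",         # hypothetical directory containing it
        "role": role_arn,                  # IAM role SageMaker assumes for the job
        "instance_type": "ml.p3.2xlarge",  # assumption
        "instance_count": 1,
        "transformers_version": "4.26",    # assumption; must match an available DLC
        "pytorch_version": "1.13",         # assumption; must match an available DLC
        "py_version": "py39",
        "hyperparameters": {"model_name_or_path": model_id, "epochs": 1},
    }


def create_estimator(config: Dict[str, Any]):
    """Instantiate the estimator; requires the `sagemaker` SDK and AWS credentials."""
    from sagemaker.huggingface import HuggingFace  # imported lazily on purpose

    return HuggingFace(**config)


config = build_estimator_config(
    role_arn="arn:aws:iam::123456789012:role/ExampleSageMakerRole",  # placeholder ARN
    model_id="distilbert-base-uncased",                               # placeholder model
)
```

Calling `create_estimator(config).fit({"train": "s3://..."})` with real S3 inputs would then start a managed training job; the channel name and data location are left to the reader.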
With this integration, Searchmetrics was able to test and experiment with a range of different models and approaches to find the best-performing approach to their use case.
The end solution uses a zero-shot classification pipeline to identify the most relevant keywords. Different pre-trained models and query strategies were evaluated, with facebook/bart-large-mnli providing the most promising results.
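To make the idea concrete, here is a minimal sketch of ranking keywords with a zero-shot classification pipeline from the Transformers library. The post does not disclose the query strategy Searchmetrics settled on, so treating the topic as the input sequence, the keywords as candidate labels, and the hypothesis template shown here are assumptions, as are the helper names.

```python
from typing import Callable, List, Sequence, Tuple


def load_zero_shot_classifier(model_name: str = "facebook/bart-large-mnli"):
    """Build a zero-shot pipeline; needs `transformers` and downloads the model on first use."""
    from transformers import pipeline  # imported lazily on purpose

    return pipeline("zero-shot-classification", model=model_name)


def rank_keywords(
    topic: str,
    keywords: Sequence[str],
    classifier: Callable,
    hypothesis_template: str = "This topic is related to {}.",  # assumed template
) -> List[Tuple[str, float]]:
    """Score every keyword against the topic and return (keyword, score), best first."""
    result = classifier(
        topic,
        candidate_labels=list(keywords),
        multi_label=True,  # score each keyword independently of the others
        hypothesis_template=hypothesis_template,
    )
    pairs = zip(result["labels"], result["scores"])
    return sorted(pairs, key=lambda pair: pair[1], reverse=True)
```

With the real model, usage would look like `rank_keywords("electric cars", ["ev charging", "banana bread"], load_zero_shot_classifier())`. Because `rank_keywords` accepts any callable with the same interface, the ranking logic can also be exercised with a stub instead of the full model.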
Using AWS to improve operational efficiency and find new innovation opportunities
With SageMaker and its native integration with Hugging Face, Searchmetrics was able to build, train, and deploy an NLP solution that could understand a given topic and accurately rank an unseen list of keywords based on their relevance. The toolset offered by SageMaker made it easier to experiment and deploy.
When integrated with Searchmetrics’ existing internal tool, this AI capability delivered an average reduction of 20% in the time taken for human analysts to complete their job. This resulted in higher throughput, improved user experience, and faster onboarding of new users.
This initial success has not only improved the operational performance of Searchmetrics’ search analysts, but has also helped Searchmetrics chart a clearer path to deploying more comprehensive automation solutions using AI in its business.
These exciting new innovation opportunities help Searchmetrics continue to improve their insights capabilities, and also help them ensure that customers continue to stay ahead in the hyper-competitive search landscape.
In addition, Hugging Face and AWS announced a partnership earlier in 2022 that makes it even easier to train Hugging Face models on SageMaker. This functionality is available through the development of Hugging Face AWS Deep Learning Containers (DLCs). These containers include the Hugging Face Transformers, Tokenizers, and Datasets libraries, so you can use these resources for training and inference jobs.
For a list of the available DLC images, see available Deep Learning Containers Images, which are maintained and regularly updated with security patches. You can find many examples of how to train Hugging Face models with these DLCs and the Hugging Face Python SDK in the following GitHub repo.
Learn more about how you can accelerate your ability to innovate with AI/ML by visiting Getting Started with Amazon SageMaker, getting hands-on learning content by reviewing the Amazon SageMaker developer resources, or visiting Hugging Face on Amazon SageMaker.
About the Author
Daniel Burke is the European lead for AI and ML in the Private Equity group at AWS. Daniel works directly with Private Equity funds and their portfolio companies, helping them accelerate their AI and ML adoption to improve innovation and increase enterprise value.