Guru Unlocks New Business Opportunities Using Amazon OpenSearch Service
Guru Technologies (Guru), a startup that provides knowledge management software, makes it simple for companies to access their internal information whenever needed, no matter where it’s stored. Fast, relevant query results have always been critical to Guru and its customer base, which includes companies such as Slack, Noom, Nubank, Zoom Video Communications, Shopify, and Spotify. But as Guru saw significant growth—particularly sizable year-over-year growth in monthly active users—it found that its self-managed Elasticsearch solution didn’t have the scalability, speed, or reliability the company needed to continue innovating at scale.
Having used Amazon Web Services (AWS) since its founding in 2014, Guru again looked to AWS for a solution. The company migrated to Amazon OpenSearch Service, a managed service that makes it easy to perform interactive log analytics, real-time application monitoring, website search, and more.
Migrating to Fully Managed Elasticsearch
Looking to implement a cloud-based solution, Guru built its tech infrastructure on AWS from the beginning. The company knew that the cloud could fulfill its needs for storage, scaling, and elasticity, whereas running its infrastructure at a colocation center would require significant effort for management and capacity expansion. “When we started on AWS, the goal was to have infrastructure as code so that we could spin up our environments automatically,” says Mitchell Stewart, chief technology officer and cofounder of Guru.
The company initially used AWS CloudFormation, which offers a simple way to model a collection of related AWS and third-party resources, provision them quickly and consistently, and manage them throughout their lifecycles by treating infrastructure as code. Guru also made use of Amazon Elastic Block Store (Amazon EBS), a simple-to-use, high-performance block-storage service designed for use on Amazon Elastic Compute Cloud (Amazon EC2) for both throughput and transaction-intensive workloads at any scale. “We began with a very simple architecture,” says Stewart. “Since then, we’ve continued adopting all these pieces of technology AWS has released over the past 7 years. Our architecture has gotten a lot more complex, but the principle is the same: AWS continues to provide fully managed services, solving a number of elastic and dynamic scaling problems so that we don’t have to solve them ourselves.”
For Guru, one such scaling problem centered on Elasticsearch. The company initially hosted its own Elasticsearch cluster, using Amazon EC2 for compute. “Elasticsearch is a core part of our product,” says Stewart. “We have focused a lot of resources and attention on it because we are actively looking to improve the overall search performance by providing low latency and relevant search results for our users.” The decision to migrate to Amazon OpenSearch Service was based on resources. “We asked ourselves, ‘Do we want to have dedicated employees worrying about our own Elasticsearch cluster,’” continues Stewart, “‘or would we rather have an Elasticsearch service provide expert management?’”
Accelerating Experimentation and Innovation
Guru began its migration to Amazon OpenSearch Service in the summer of 2020 and completed it a few months later. The company soon saw several benefits. For one, it was able to use Amazon EMR, an industry-leading cloud big data service for processing vast amounts of data using open-source tools, to develop an experimentation framework for improving the relevance of its search results. This would ultimately help users find the information they’re looking for more quickly.
Using this framework, Guru can run many quick, useful tests. For example, the company can spin up a new Elasticsearch cluster with proposed algorithm changes and determine whether the search result relevancy of the new cluster is better or worse than that of the original production cluster. Guru was able to measure and compare search result relevancy in part because Amazon OpenSearch Service enables the company to log search queries in real time. “If we didn’t have [Amazon OpenSearch Service], Amazon EMR, and all those tools readily available for experimenting with algorithm iterations, we wouldn’t have had the bandwidth to even consider doing it,” says Stewart.
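As a rough illustration of how two clusters might be compared, the sketch below scores the results each cluster returns for the same replayed queries. The case study does not say which relevancy metric Guru uses. NDCG@k (with linear gains) is an assumption chosen here because it is a common ranking metric, and the query strings, document IDs, and relevance judgments are hypothetical.

```python
import math
from typing import Dict, List


def dcg_at_k(gains: List[float], k: int) -> float:
    """Discounted cumulative gain over the first k gains (rank 1 is undiscounted)."""
    return sum(g / math.log2(rank + 2) for rank, g in enumerate(gains[:k]))


def ndcg_at_k(ranked_ids: List[str], judgments: Dict[str, float], k: int) -> float:
    """NDCG@k for one query: DCG of the returned order divided by the ideal DCG."""
    gains = [judgments.get(doc_id, 0.0) for doc_id in ranked_ids]
    ideal_dcg = dcg_at_k(sorted(judgments.values(), reverse=True), k)
    return dcg_at_k(gains, k) / ideal_dcg if ideal_dcg > 0 else 0.0


def mean_ndcg(results: Dict[str, List[str]],
              judgments: Dict[str, Dict[str, float]],
              k: int = 10) -> float:
    """Average NDCG@k over every judged query that the cluster returned results for."""
    scores = [ndcg_at_k(results[q], judgments[q], k) for q in judgments if q in results]
    return sum(scores) / len(scores) if scores else 0.0


# Hypothetical relevance judgments: query -> {document ID: graded relevance}.
judgments = {
    "reset password": {"card-1": 3.0, "card-2": 1.0},
    "pto policy": {"card-7": 2.0},
}
# Hypothetical ranked results from the production and candidate clusters.
production = {
    "reset password": ["card-2", "card-9", "card-1"],
    "pto policy": ["card-8", "card-7"],
}
candidate = {
    "reset password": ["card-1", "card-2", "card-9"],
    "pto policy": ["card-7", "card-8"],
}

production_score = mean_ndcg(production, judgments)
candidate_score = mean_ndcg(candidate, judgments)
print(f"production={production_score:.3f} candidate={candidate_score:.3f}")
```

In this toy data the candidate cluster ranks the most relevant documents first, so it scores higher than production. In practice the judgments would have to come from logged user behavior or manual labeling.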
The experiments Guru had previously attempted took weeks or months. But after the migration to the AWS environment, the company could run experiments in hours or even minutes. “Every time we needed to run an experiment before, we would have a DevOps resource spend 5–6 hours just scaling it up so we could actually run the experiment,” says Nabin Mulepati, principal machine learning engineer at Guru. “Now we can just say, ‘Hey, give me 30 nodes,’ and in an hour, we have a cluster that’s ready to run experiments. And once we’re done, we can scale it down so that we don’t incur unnecessary costs.”
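The case study does not describe how Guru issues these scaling requests. As a minimal sketch, assuming the resize goes through the Amazon OpenSearch Service `UpdateDomainConfig` API via boto3, the request parameters could be built as follows. The domain name and the scaled-down node count are hypothetical.

```python
from typing import Any, Dict


def resize_request(domain_name: str, instance_count: int) -> Dict[str, Any]:
    """Build keyword arguments for an OpenSearch Service domain resize."""
    if instance_count < 1:
        raise ValueError("instance_count must be at least 1")
    return {
        "DomainName": domain_name,
        "ClusterConfig": {"InstanceCount": instance_count},
    }


# Hypothetical experiment domain: scale up to 30 data nodes, then back down.
scale_up = resize_request("search-experiments", 30)
scale_down = resize_request("search-experiments", 3)

# With AWS credentials configured, each dict could be passed as keyword arguments:
#   boto3.client("opensearch").update_domain_config(**scale_up)
print(scale_up)
```

Keeping the request construction separate from the API call makes the sizing logic easy to review and test without touching a live domain.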
Between the completion of its migration in the fall of 2020 and early 2021, Guru ran experiments that involved replaying nearly half a billion queries. As a result of these experiments, the company saw a 10 percent improvement in search performance.
Even when Guru isn’t actively running experiments, the managed Amazon OpenSearch Service environment makes upgrades much simpler for the company. “In the past, we couldn’t take advantage of the new features Elasticsearch was pushing out, which meant we weren’t able to solve problems for our customers,” says Jeff Plater, principal engineer at Guru. “Now that we’ve migrated to [Amazon OpenSearch Service], we can stay up to date and get those features. Ultimately, that will enable us to more quickly improve the search service for our users.” With up to 1 million search requests a day, Guru can’t afford to slow down.
Opening the Door to Machine Learning
By migrating from self-managed Elasticsearch clusters to Amazon OpenSearch Service, Guru was able to spend more time focusing on experimentation and innovation. With its experimentation framework in place, Guru has a scalable path for experimenting with machine learning and deep learning, including implementing the k-nearest neighbors algorithm and learning to rank. The company also plans to start using Amazon SageMaker, which helps data scientists and developers prepare, build, train, and deploy high-quality machine learning models quickly by bringing together a broad set of capabilities purpose-built for machine learning.
As a startup looking to grow quickly while releasing new features, Guru found that AWS could provide the reliability, scalability, and elasticity the company needed to continue to innovate. “One thing that’s great about AWS is that it’s self-service: you can move as fast as you want within the environment itself,” says Steve Mayernick, director of product marketing at Guru. “You can get in really quickly, use whatever systems are necessary for your startup, and then just iterate and iterate and iterate. You can build everything up without needing permission to engage with a third-party vendor that might slow you down.”
Guru Technologies provides knowledge management software that helps organizations manage and access critical internal information.
Benefits of AWS
- Reduced time and resources spent on Elasticsearch management
- Developed speedy new experimentation framework
- Ran experiments by replaying up to a half-billion queries
- Reduced experimentation time from weeks to hours
- Improved search relevancy by 10%
AWS Services Used
Amazon OpenSearch Service
Amazon OpenSearch Service makes it easy for you to perform interactive log analytics, real-time application monitoring, website search, and more. OpenSearch is an open source, distributed search and analytics suite derived from Elasticsearch. Amazon OpenSearch Service offers the latest versions of OpenSearch, support for 19 versions of Elasticsearch (1.5 to 7.10 versions), and visualization capabilities powered by OpenSearch Dashboards and Kibana (1.5 to 7.10 versions).
Amazon EMR
Amazon EMR is the industry-leading cloud big data platform for processing vast amounts of data using open source tools such as Apache Spark, Apache Hive, Apache HBase, Apache Flink, Apache Hudi, and Presto.