AWS Database Blog

Unlock the power of Amazon DocumentDB text search with real-world use cases

Amazon DocumentDB (with MongoDB compatibility) is a fast, scalable, highly available, and fully managed document database service that supports JSON workloads. Amazon DocumentDB recently launched text search, a native capability that lets you run complex searches over large amounts of textual data.

Traditional database queries typically limit searches to exact matches, making them inadequate for real-world text search use cases. Text search capabilities extend beyond this limitation, allowing you to search for words and phrases within your documents. Imagine searching for a specific ingredient on a restaurant menu, filtering through product descriptions on an e-commerce site, or finding relevant articles in a large content repository – all possible with text search functionality.

In this post, we explore the practical applications and benefits of using text search in Amazon DocumentDB and showcase its real-world applications through compelling use cases.

Understanding Amazon DocumentDB text search

The text search functionality of Amazon DocumentDB uses text indexes to search large textual data for specific terms or phrases. Text indexes are specialized indexes created on fields containing text data. They make searches over large textual data faster and more efficient than regular indexes or regular expressions. Amazon DocumentDB supports several text-specific operators and options that enhance the search experience. Refer to Performing text search with Amazon DocumentDB for supported operators.

You specify the string “text” to create a text index on fields that contain string data. In the following example, you create a text index on the title and content fields of news article documents.

db.articles.createIndex({ title: "text", content: "text"});

You use the $text and $search operators to perform text searches. The following example returns all documents whose text-indexed fields (title and content) contain the string “movie”, including variants such as “movies”.

db.articles.find({$text: {$search: "movie"}})

Let’s examine a few practical use cases for the text search feature.

Use case: Product catalog search

Imagine a vast e-commerce platform with an extensive product catalog. Product catalog search is a common use case in which users search for and retrieve products from a catalog based on certain criteria. You can use Amazon DocumentDB text search to let users search for products by name, description, or even specification. This plays a crucial role in shaping the user’s shopping experience, because it helps them discover relevant products quickly.

The following steps walk through a product catalog search with Amazon DocumentDB text search.

Users may want to search for products based on product name, description, or specification. You can create a text index on the name, description, and specification fields using the compound text index syntax as follows:

// Create text index
db.products.createIndex({ name: "text", description: "text", specification: "text" });

The following example returns all documents that contain the word “laptop” in the indexed fields.

// Search for products
db.products.find({ $text: { $search: "laptop" } });

You can use text search in combination with other query operators to filter results based on specific criteria. In the following example, you perform a text search for products containing the text “laptop” and with an additional filter for the “electronics” category:

// Search for products with additional criteria
db.products.find({ $text: { $search: "laptop" }, category: "electronics" });
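In an application, the filter is often assembled from user input rather than written by hand. The following is a minimal sketch in plain JavaScript of how that might look. The helper name buildProductQuery is hypothetical and not part of any Amazon DocumentDB or driver API; it only builds the query document you would pass to find().

```javascript
// Hypothetical helper (not part of any driver or Amazon DocumentDB API):
// builds a find() filter that combines $text with optional equality filters.
function buildProductQuery(searchTerm, filters = {}) {
    if (typeof searchTerm !== "string" || searchTerm.trim() === "") {
        throw new Error("searchTerm must be a non-empty string");
    }
    // Spread the filters first so they cannot overwrite the $text clause.
    return { ...filters, $text: { $search: searchTerm.trim() } };
}

const query = buildProductQuery("laptop", { category: "electronics" });
console.log(JSON.stringify(query));
// In the mongo shell, you could then run: db.products.find(query)
```

Rejecting an empty search term up front keeps an accidental blank search box from reaching the database as a text query.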

Text search score

Amazon DocumentDB assigns each matching document a score that reflects how relevant the document is to the search term. You can project this score and sort on it to rank search results. See the following code:

// Sort by score
db.products.find(
    { $text: { $search: "smartphone" } },
    { score: { $meta: "textScore" } }
).sort({ score: { $meta: "textScore" } });

Let’s assume you have a product catalog with documents in the following format:

{
    name: "Smartphone AWSome X9 Pro",
    description: "A high-performance smartphone with advanced features.",
    category: "electronics",
    price: 499.99,
    stock: 100,
    tags: ["mobile", "technology", "android"]
}

Let’s perform a text search for smartphones priced below $500, sorted by relevance:

db.products.find(
    {
        $text: { $search: "smartphone" },
        price: { $lt: 500 }
    },
    { score: { $meta: "textScore" } }
).sort({ score: { $meta: "textScore" } });

This query searches for documents containing the word “smartphone” with a price less than $500. The results are sorted by the relevance score assigned by the text search feature.
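The three parts of this query (filter, projection, and sort) can be generated together so that the score projection and the sort key stay in sync. The following is a sketch in plain JavaScript under that assumption. The helper rankedSearchParts is a hypothetical name introduced for illustration only.

```javascript
// Hypothetical helper: returns the filter, projection, and sort documents
// used by a relevance-ranked text search with an optional price ceiling.
function rankedSearchParts(searchTerm, maxPrice) {
    const scoreMeta = { $meta: "textScore" };
    const filter = { $text: { $search: searchTerm } };
    // Add the price filter only when a numeric ceiling is supplied.
    if (Number.isFinite(maxPrice)) {
        filter.price = { $lt: maxPrice };
    }
    return {
        filter,
        projection: { score: scoreMeta },
        sort: { score: scoreMeta }
    };
}

const parts = rankedSearchParts("smartphone", 500);
console.log(JSON.stringify(parts, null, 2));
// mongo shell usage:
// db.products.find(parts.filter, parts.projection).sort(parts.sort)
```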

Use case: Content management system

For a content-heavy platform like a content management system (CMS), Amazon DocumentDB text search can help you find or retrieve articles, blog posts, or documents that match specific keywords or a phrase. This enhances content discoverability and improves user engagement.

Let’s perform a phrase search for articles that contain the phrase “DocumentDB text search” in either the title or content:

// Create text index
db.articles.createIndex({ title: "text", content: "text" });

// Search for the phrase "DocumentDB text search" in articles
db.articles.find({ $text: { $search: "\"DocumentDB text search\"" } });

Note the use of double quotes around the phrase to specify that it should be treated as a whole phrase. This query returns documents where the exact phrase “DocumentDB text search” is present in either the title or content field.
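When the phrase comes from user input, the escaped quotes are easy to get wrong. The following plain JavaScript sketch wraps a phrase in the double quotes that a phrase search expects. The helper phraseSearch is a hypothetical name, and stripping embedded double quotes from the input is a design choice made here for illustration, not a documented requirement.

```javascript
// Hypothetical helper: wraps a phrase in double quotes for a $text phrase search.
function phraseSearch(phrase) {
    // Drop any double quotes in the input so they cannot end the phrase early.
    const cleaned = phrase.replace(/"/g, "").trim();
    if (cleaned === "") {
        throw new Error("phrase must contain at least one character");
    }
    return { $text: { $search: `"${cleaned}"` } };
}

const phraseQuery = phraseSearch("DocumentDB text search");
console.log(phraseQuery.$text.$search); // "DocumentDB text search" (quotes included)
// mongo shell usage: db.articles.find(phraseQuery)
```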

Use case: Job search platform

On a job search platform, users often look for positions that match their skills or interests. Amazon DocumentDB text search facilitates this process by enabling searches based on job titles, descriptions, or required skills. In a job search, it helps to give more weight to specific fields, such as job titles or skills, so that matches in those fields rank higher and the search is more targeted.

To implement a weighted index, you can assign weights to specific fields during index creation. In the following example, you assign higher weights to the title and skills fields, indicating their higher importance in the search:

db.jobs.createIndex(
    {
        title: "text",
        description: "text",
        skills: "text",
        location: "text"
    },
    {
        weights: {
            title: 3,
            skills: 2
        }
    }
)

Let’s explore how a weighted search query might look:

db.jobs.find(
    { $text: { $search: "Senior Software Engineer" } },
    { score: { $meta: "textScore" } }
).sort({ score: { $meta: "textScore" } })

In this query, Amazon DocumentDB calculates a relevance score based on the weighted index. The results are then sorted by this score in descending order, so the most relevant jobs appear at the top.
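A weight that names a field missing from the index is an easy mistake to make when the field list changes. The following plain JavaScript sketch builds the two createIndex arguments from a field list and a weights map, and rejects weights for fields that are not indexed. The helper weightedTextIndex is hypothetical, and the validation rule is an assumption made for illustration rather than documented Amazon DocumentDB behavior.

```javascript
// Hypothetical helper: builds the key document and options for a weighted text index.
function weightedTextIndex(fields, weights = {}) {
    const keys = {};
    for (const field of fields) {
        keys[field] = "text";
    }
    // Assumption for this sketch: every weighted field must also be indexed.
    for (const field of Object.keys(weights)) {
        if (!(field in keys)) {
            throw new Error(`weight given for non-indexed field: ${field}`);
        }
    }
    return { keys, options: { weights } };
}

const jobIndex = weightedTextIndex(
    ["title", "description", "skills", "location"],
    { title: 3, skills: 2 }
);
console.log(JSON.stringify(jobIndex, null, 2));
// mongo shell usage: db.jobs.createIndex(jobIndex.keys, jobIndex.options)
```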

Use case: Social media mentions

Social media platforms generate vast amounts of textual data through user posts, comments, and messages. You can use Amazon DocumentDB text search to search through this data, identifying relevant mentions or discussions related to specific topics.

In the following example, you create a text index on the content field, then use a text search to find posts containing the specified keyword. The aggregation query groups the results by user, counts the mentions, and identifies the latest post timestamp for each user. Finally, the results are sorted by the mention count in descending order.

// Create text index
db.posts.createIndex({ content: 'text' });

// Count keyword mentions per user
db.posts.aggregate([
    { $match: { $text: { $search: 'DocumentDB'} }},
    {
        $group: {
            _id: '$user',
            count: { $sum: 1 },
            latestPost: { $max: '$timestamp' }
        }
    },
    { $sort: { count: -1 } }
]);
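To make the grouping and sorting stages concrete, the following plain JavaScript sketch computes the same kind of summary in memory. The sample posts, the numeric timestamps, and the function name summarizeMentions are all hypothetical. The sketch assumes the posts have already passed the $match stage and does not reproduce how Amazon DocumentDB evaluates the pipeline.

```javascript
// Hypothetical sample: posts that already matched the text search.
const matchedPosts = [
    { user: "alice", timestamp: 1700000300 },
    { user: "bob", timestamp: 1700000100 },
    { user: "alice", timestamp: 1700000200 }
];

// Mirrors the $group and $sort stages: one entry per user with a mention
// count and the latest timestamp, ordered by count in descending order.
function summarizeMentions(posts) {
    const byUser = new Map();
    for (const { user, timestamp } of posts) {
        const entry = byUser.get(user) || { _id: user, count: 0, latestPost: timestamp };
        entry.count += 1;
        if (timestamp > entry.latestPost) {
            entry.latestPost = timestamp;
        }
        byUser.set(user, entry);
    }
    return [...byUser.values()].sort((a, b) => b.count - a.count);
}

console.log(summarizeMentions(matchedPosts));
// alice comes first with count 2 and latestPost 1700000300; bob follows with count 1
```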

Amazon DocumentDB text search is flexible and you can apply it to a range of use cases where effective text-based querying is necessary for data analysis and retrieval.

Conclusion

Amazon DocumentDB text search empowers you to create powerful and efficient search functionalities in various applications, from e-commerce platforms to content management systems. By understanding and harnessing the capabilities of text indexes and search operators, developers can significantly enhance the user experience and make data retrieval a seamless process. As we’ve seen through these use cases, Amazon DocumentDB text search is a valuable feature for building dynamic and responsive applications.

As always, AWS welcomes your feedback. Leave any thoughts or questions in the comments section.


About the authors

Gururaj S Bayari is a Senior DocumentDB Specialist Solutions Architect at AWS. He enjoys helping customers adopt Amazon’s purpose-built databases. He helps customers design, evaluate, and optimize their internet-scale and high-performance workloads powered by NoSQL or relational databases.

Kunal Agarwal is a Senior Product Manager at AWS. Kunal is passionate about data and loves building scalable products to solve customer problems. Prior to joining AWS, Kunal spent 12 years in product management and strategy in the technology industry.