AWS Machine Learning Blog

Search for knowledge in Quip documents with intelligent search using the Quip connector for Amazon Kendra

Organizations use collaborative document authoring solutions like Salesforce Quip to embed real-time, collaborative documents inside Salesforce records. Quip is Salesforce’s productivity platform that transforms the way enterprises work together, delivering modern collaboration securely and simply across any device. A Quip repository captures invaluable organizational knowledge in the form of collaborative documents and workflows. However, finding this organizational knowledge easily and securely along with other document repositories, such as Box or Amazon Simple Storage Service (Amazon S3), can be challenging. Additionally, the conversational nature of collaborative workflows renders the traditional keyword-based approach to search ineffective, because the information is fragmented and dispersed across multiple places.

We’re excited to announce that you can now use the Amazon Kendra connector for Quip to search messages and documents in your Quip repository. In this post, we show you how to find the information you need in your Quip repository using the intelligent search function of Amazon Kendra, powered by machine learning.

Solution overview

With Amazon Kendra, you can configure multiple data sources to provide a central place to search across your document repository. For our solution, we demonstrate how to configure a Quip repository as a data source of a search index using the Amazon Kendra connector for Quip.

The following screenshot shows an example Quip repository.

The workspace in this example has a private folder that is not shared. That folder has a subfolder that is used to keep expense receipts. Another folder called example.com is shared with others and used to collaborate with the team. This folder has five subfolders that hold documentation for development.

To configure the Quip connector, we first note the domain name, folder IDs, and access token of the Quip repository. Then we simply create the Amazon Kendra index and add Quip as a data source.

Prerequisites

To get started using the Quip connector for Amazon Kendra, you must have a Quip repository.

Gather information from Quip

Before we set up the Quip data source, we need a few details about your repository. Let’s gather those in advance.

Domain name

First, identify the domain name of your Quip site. For example, for the Quip URL https://example-com.quip.com/browse, the domain name is quip. Depending on how single sign-on (SSO) is set up in your organization, the domain name may vary. Save this domain name to use later.
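If you prefer to derive the domain name programmatically, the following is a minimal sketch that follows the example above and takes the second-to-last label of the host name. The function name quip_domain_name is our own, and the rule is an assumption that doesn't hold for hosts with multi-part suffixes (such as .co.uk) or for unusual SSO setups.

```python
from urllib.parse import urlparse


def quip_domain_name(quip_url: str) -> str:
    """Return the second-to-last host label, e.g. 'quip' for example-com.quip.com."""
    host = urlparse(quip_url).hostname
    if not host:
        raise ValueError(f"not a valid URL: {quip_url!r}")
    labels = host.split(".")
    if len(labels) < 2:
        raise ValueError(f"unexpected host name: {host!r}")
    return labels[-2]


print(quip_domain_name("https://example-com.quip.com/browse"))  # prints: quip
```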

Folder IDs

Folders in Quip have a unique ID associated with them. We need to configure the Quip connector to access the right folders by supplying the correct folder IDs. For this post, we index the folder example.com.

To find the ID of the folder, choose the folder. The URL changes to show the folder ID.

The folder ID in this case is xj1vOyaCGB3u. Make a list of the folder IDs to scan; we use these IDs when configuring the connector.

Access token

Log in to Quip and open https://{subdomain.domain}/dev/token in a web browser. In the following example, we navigate to https://example-com.quip.com/dev/token. Then choose Get Personal Access Token.

Copy the token to use in a later step.

We now have the information we need to configure the data source.
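Before moving on, you may want to confirm that the token can read the folder. The following sketch only builds the request and doesn't send it. It assumes the Quip Automation API is reachable at https://platform.quip.com/1 and accepts the token as a Bearer header; the host may differ for organizations with a custom domain, and the helper names are our own.

```python
import urllib.request

# Assumption: the Quip Automation API base URL; custom domains may use another host.
QUIP_API_BASE = "https://platform.quip.com/1"


def quip_request(path: str, access_token: str) -> urllib.request.Request:
    """Build (but do not send) an authenticated GET request for the Quip API."""
    return urllib.request.Request(
        f"{QUIP_API_BASE}/{path.lstrip('/')}",
        headers={"Authorization": f"Bearer {access_token}"},
        method="GET",
    )


def folder_request(folder_id: str, access_token: str) -> urllib.request.Request:
    """Request for the metadata of one folder, identified by its folder ID."""
    return quip_request(f"folders/{folder_id}", access_token)
```

Passing the result of folder_request("xj1vOyaCGB3u", token) to urllib.request.urlopen should succeed if the token and folder ID are valid; an HTTP error suggests that one of them needs to be rechecked.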

Create an Amazon Kendra index

To set up your Amazon Kendra index, complete the following steps:

  1. Sign in to the AWS Management Console and open the Amazon Kendra console.

If you’re using Amazon Kendra for the first time, you should see the following screenshot.

  1. Choose Create an index.
  2. For Index name, enter my-quip-example-index.
  3. For Description, enter an optional description.
  4. For IAM role, use an existing role or create a new one.
  5. Choose Next.
  6. Under Access control settings, select No to make all indexed content available to all users.
  7. For User-group expansion, select None.
  8. Choose Next.

For Provisioning editions, you can choose from two options depending on the volume of the content and frequency of access.

  1. For this post, select Developer edition.
  2. Choose Create.

Role creation takes approximately 30 seconds; index creation can take up to 30 minutes. When complete, you can view your index on the Amazon Kendra console.
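The console steps above can also be scripted. The following is a sketch using the AWS SDK for Python (Boto3), not the method used in this post. It assumes you pass in a client created with boto3.client("kendra") and the ARN of an existing IAM role; the function name and the polling defaults are our own choices.

```python
import time


def create_index_and_wait(kendra, role_arn, name="my-quip-example-index",
                          poll_seconds=30, timeout_seconds=3600):
    """Create a Developer edition index and poll until it becomes ACTIVE."""
    index_id = kendra.create_index(
        Name=name,
        Edition="DEVELOPER_EDITION",
        RoleArn=role_arn,
    )["Id"]
    deadline = time.monotonic() + timeout_seconds
    while True:
        status = kendra.describe_index(Id=index_id)["Status"]
        if status == "ACTIVE":
            return index_id
        if status == "FAILED":
            raise RuntimeError(f"index {index_id} failed to create")
        if time.monotonic() >= deadline:
            raise TimeoutError(f"index {index_id} still {status} after timeout")
        time.sleep(poll_seconds)
```

Because index creation can take up to 30 minutes, the default timeout is deliberately generous.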

Add Quip as a data source

Now let’s add Quip as a data source to the index.

  1. On the Amazon Kendra console, under Data management in the navigation pane, choose Data sources.
  2. Choose Add connector under Quip.
  3. For Data source name, enter my-quip-data-source.
  4. For Description, enter an optional description.
  5. Choose Next.
  6. Enter the Quip domain name that you saved earlier.
  7. Under Secrets, choose Create and add a new Secrets Manager secret.
  8. For Secret name, enter the name of your secret.
  9. For Quip token, enter the access token you saved earlier.
  10. Choose Save and add secret.
  11. Under IAM role, choose a role or create a new one.
  12. Choose Next.
  13. Under Sync scope, for Add folder IDs to crawl, enter the folder IDs you saved earlier.
  14. Under Sync run schedule, for Frequency, select Run on demand.
  15. Choose Next.

The Quip connector lets you capture additional fields like authors, categories, and folder names (and even rename them as needed).

  1. For this post, we don’t configure any field mappings.
  2. Choose Next.
  3. Confirm all the options and add the data source.

Your data source is ready in a few minutes.
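For reference, the same data source setup can be sketched with Boto3. This is an illustration under assumptions rather than the procedure used in this post: the clients come from boto3.client("kendra") and boto3.client("secretsmanager"), the secret name my-quip-secret is hypothetical, and we assume the connector reads the token from a JSON secret with an accessToken key. Leaving out a schedule corresponds to Run on demand.

```python
import json


def add_quip_data_source(kendra, secrets, index_id, role_arn, domain,
                         folder_ids, access_token,
                         secret_name="my-quip-secret",  # hypothetical name
                         data_source_name="my-quip-data-source"):
    """Store the Quip token in Secrets Manager and register Quip as a data source."""
    # Assumption: the connector expects the token under the "accessToken" key.
    secret_arn = secrets.create_secret(
        Name=secret_name,
        SecretString=json.dumps({"accessToken": access_token}),
    )["ARN"]
    # No Schedule argument is passed, so syncs run only when started on demand.
    response = kendra.create_data_source(
        IndexId=index_id,
        Name=data_source_name,
        Type="QUIP",
        RoleArn=role_arn,
        Configuration={
            "QuipConfiguration": {
                "Domain": domain,
                "SecretArn": secret_arn,
                "FolderIds": list(folder_ids),
            }
        },
    )
    return response["Id"]
```

For the example in this post, domain would be quip and folder_ids would be ["xj1vOyaCGB3u"].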

  1. When your data source is ready, choose Sync now.

Depending on the size of the data in the Quip repository, this process can take a few minutes to a few hours. Syncing is a two-step process. First, the documents are crawled to determine the ones to index. Then the selected documents are indexed. Some factors that affect sync speed include repository throughput and throttling, network bandwidth, and the size of documents.

The sync status shows as successful when the sync is complete. Your Quip repository is now connected.
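If you want to start and monitor the sync from code instead of the console, a sketch along the following lines may help. It assumes a Boto3 Kendra client, and that the job we started appears in the first page of the sync history. The function name and polling defaults are our own, and we treat the SYNCING and SYNCING_INDEXING statuses as roughly matching the crawl and index steps described above.

```python
import time

# Statuses that mean the job has not finished yet.
IN_PROGRESS = {"SYNCING", "SYNCING_INDEXING", "STOPPING"}


def sync_and_wait(kendra, index_id, data_source_id,
                  poll_seconds=60, timeout_seconds=6 * 3600):
    """Start an on-demand sync and poll until it reaches a final status."""
    execution_id = kendra.start_data_source_sync_job(
        Id=data_source_id, IndexId=index_id
    )["ExecutionId"]
    deadline = time.monotonic() + timeout_seconds
    while True:
        history = kendra.list_data_source_sync_jobs(
            Id=data_source_id, IndexId=index_id
        )["History"]
        # Find the job we started among the returned history entries.
        job = next((j for j in history if j.get("ExecutionId") == execution_id), None)
        if job is not None and job["Status"] not in IN_PROGRESS:
            return job["Status"]  # for example SUCCEEDED or FAILED
        if time.monotonic() >= deadline:
            raise TimeoutError(f"sync job {execution_id} did not finish in time")
        time.sleep(poll_seconds)
```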

Run a search in Amazon Kendra

Let’s test the connector by running a few searches.

  1. On the Amazon Kendra console, under Data management in the navigation pane, choose Search indexed content.
  2. Enter your search in the search field. For this post, we search for EC2 on Linux.

The following screenshot shows our results.
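The same search can be run through the Amazon Kendra Query API. The following sketch assumes a Boto3 Kendra client and the ID of the index created earlier; the function name and the shape of the returned dictionaries are our own choices for illustration.

```python
def search(kendra, index_id, query_text="EC2 on Linux", max_results=5):
    """Query the index and return a simplified list of results."""
    response = kendra.query(
        IndexId=index_id,
        QueryText=query_text,
        PageSize=max_results,
    )
    results = []
    for item in response.get("ResultItems", []):
        results.append({
            "type": item.get("Type"),
            "title": item.get("DocumentTitle", {}).get("Text", ""),
            "uri": item.get("DocumentURI", ""),
            "excerpt": item.get("DocumentExcerpt", {}).get("Text", ""),
        })
    return results
```

Each returned entry carries the result type, document title, source URI, and a short excerpt, which is enough to check that documents from the Quip folder are being returned.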

Limitations

The connector has some known limitations for data source ingestion. Some stem from the need for admin access to reach certain content; others stem from specific implementation details. They are as follows:

  • Only full crawls are supported. If you want the connector to support changelog crawls, admin API access is required, and you have to enable the admin API on the Quip website.
  • Only shared folders are crawled. Even if we use the personal access token of an admin user, we can’t crawl data in the private folders of other users.
  • The solution doesn’t support specifying file types for inclusion and exclusion, because Quip doesn’t store the file type extension, just the file name.
  • Real-time events require a subscription and admin API access.

Conclusion

The Amazon Kendra connector for Quip enables organizations to make the invaluable information stored in Quip documents available to their users securely using intelligent search powered by Amazon Kendra. The connector also provides facets for Quip repository attributes such as authors, file type, source URI, creation dates, parent files, and category so users can interactively refine the search results based on what they’re looking for.

For more information on how you can create, modify, and delete data and metadata using custom document enrichment as content is ingested from the Quip repository, refer to Customizing document metadata during the ingestion process and Enrich your content and metadata to enhance your search experience with custom document enrichment in Amazon Kendra.


About the Authors

Ashish Lagwankar is a Senior Enterprise Solutions Architect at AWS. His core interests include AI/ML, serverless, and container technologies. Ashish is based in the Boston, MA, area and enjoys reading, the outdoors, and spending time with his family.

Vikas Shah is an Enterprise Solutions Architect at Amazon Web Services. He is a technology enthusiast who enjoys helping customers find innovative solutions to complex business challenges. His areas of interest are ML, IoT, robotics, and storage. In his spare time, Vikas enjoys building robots, hiking, and traveling.