AWS Machine Learning Blog
Simplify secure search solutions with Amazon Kendra’s Principal Store
For many enterprises, critical business information is often stored as unstructured data scattered across multiple content repositories. It is challenging for organizations to make this information available to users when they need it. It is also difficult to do so securely so that relevant information is available to the right users or user groups. Different content repositories have different mechanisms to associate users with groups. These users and groups also change over time. This makes it complex to ensure that the results of a user’s search query only include documents the user is authorized to read.
Amazon Kendra, a highly accurate and easy-to-use intelligent search service powered by machine learning (ML), is releasing a new feature, the Principal Store, to simplify secure search by improving the management of user and group associations across different content repositories.
While documents are ingested for an Amazon Kendra search index, the access control lists (ACLs) for those documents are extracted by the data source connectors in Amazon Kendra and ingested as metadata along with the document. The ACLs are based on user and group information from identity providers (IdPs) from the underlying repositories. If you’re building a custom connector or using the BatchPutDocument API, you can additionally extract data source groups (local groups) and data source IDs.
The Principal Store feature provides an Application Programming Interface (API) to store the mapping between a user ID and the groups that it has access to. When a customer application issues a query to the Amazon Kendra index, the application only needs to send the user ID. Amazon Kendra looks up the user ID in the Principal Store and finds the corresponding groups. It then filters the results based on the groups the user has access to.
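As a minimal sketch of what "only needs to send the user ID" could look like from the application side, the following Python snippet builds the arguments for a query that carries a user ID and no groups. It assumes the boto3 Kendra client and its `query` call with a `UserContext` parameter; the index ID, query text, and helper name `build_user_query` are hypothetical placeholders and not part of the original post.

```python
def build_user_query(index_id, query_text, user_id):
    """Build keyword arguments for a Kendra query that identifies the caller by user ID only."""
    return {
        "IndexId": index_id,
        "QueryText": query_text,
        # No groups are listed here; the group lookup is left to the Principal Store.
        "UserContext": {"UserId": user_id},
    }


INDEX_ID = "your-index-id"  # hypothetical placeholder
query_kwargs = build_user_query(INDEX_ID, "example query text", "John")
print(query_kwargs["UserContext"])  # {'UserId': 'John'}

# Sending the query requires boto3 and AWS credentials (assumption, not run here):
# import boto3
# response = boto3.client("kendra").query(**query_kwargs)
```

The snippet only assembles the request, so it runs without AWS access; the commented lines indicate where the actual call would go.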
About the Principal Store for secure search
The Principal Store is an Amazon Kendra feature that allows you to store the mapping information between users and groups, and also between groups and other groups. For example, in your organization you could have a group for technical employees and another for sales employees. Technical employees are further divided into two groups: one for support and one for engineering. Sales, on the other hand, has a group for account managers and another group for customer service.
With the Amazon Kendra Principal Store, you can map groups inside other groups, allowing you to define which documents are exposed to which groups and ultimately which users. When you create a PrincipalMapping, you define the group ID (the identifier of the group you want to map users to) and the group members (users or subgroups that belong to that group).
For example, you can map the user John to the group Support with a single request, either through the API or from the AWS Command Line Interface (AWS CLI).
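A minimal sketch of a user-to-group mapping request, written in Python rather than the AWS CLI. It assumes the boto3 `put_principal_mapping` call with `GroupId` and `GroupMembers.MemberUsers`; the index ID and the helper name `build_user_mapping` are hypothetical and not taken from the original post.

```python
def build_user_mapping(index_id, group_id, user_ids):
    """Build keyword arguments for a mapping that adds users to one group."""
    return {
        "IndexId": index_id,
        "GroupId": group_id,
        "GroupMembers": {
            # Each member user is identified by its user ID.
            "MemberUsers": [{"UserId": user_id} for user_id in user_ids],
        },
    }


INDEX_ID = "your-index-id"  # hypothetical placeholder
john_mapping = build_user_mapping(INDEX_ID, "Support", ["John"])
print(john_mapping["GroupMembers"])  # {'MemberUsers': [{'UserId': 'John'}]}

# Sending the mapping requires boto3 and AWS credentials (assumption, not run here):
# import boto3
# boto3.client("kendra").put_principal_mapping(**john_mapping)
```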
Similarly, the users Mary, Patricia, and James are added to the Engineering, Account-Managers, and Customer-Service groups, respectively, using the same kind of command.
Now that we have both groups Support and Engineering ready, we can map them to the group Technical. We run this mapping, and a similar one for Sales (which contains Account-Managers and Customer-Service), in the AWS CLI.
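The group-in-group step can be sketched in the same way. This Python example assumes the boto3 `put_principal_mapping` call accepts subgroups under `GroupMembers.MemberGroups`; the index ID and the helper name `build_group_mapping` are hypothetical, and the membership of Sales is inferred from the organization described earlier.

```python
def build_group_mapping(index_id, parent_group_id, member_group_ids):
    """Build keyword arguments for a mapping that nests subgroups inside a parent group."""
    return {
        "IndexId": index_id,
        "GroupId": parent_group_id,
        "GroupMembers": {
            # Each member group is identified by its group ID.
            "MemberGroups": [{"GroupId": group_id} for group_id in member_group_ids],
        },
    }


INDEX_ID = "your-index-id"  # hypothetical placeholder
group_mappings = [
    build_group_mapping(INDEX_ID, "Technical", ["Support", "Engineering"]),
    build_group_mapping(INDEX_ID, "Sales", ["Account-Managers", "Customer-Service"]),
]
for mapping in group_mappings:
    print(mapping["GroupId"], [m["GroupId"] for m in mapping["GroupMembers"]["MemberGroups"]])

# Sending each mapping requires boto3 and AWS credentials (assumption, not run here):
# import boto3
# client = boto3.client("kendra")
# for mapping in group_mappings:
#     client.put_principal_mapping(**mapping)
```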
Finally, with both the Technical and Sales groups populated, we add them to the group Company, again with a single operation in the AWS CLI.
How filters and queries work
To demonstrate how filters and queries work, we continue with our example and use a document repository with document types and read permissions summarized in the following table.
|Document Type|Read Permissions|
|Case Studies|Account Managers, Engineering|
|User Guides|Customer Service, Support|
In the following screenshot, we set John as the username in the Amazon Kendra search console and issued a query. The answer is from a whitepaper, and the results are from user guides (John belongs to Support), whitepapers (Support is part of Technical), and blogs (Technical is part of Company). We didn’t specify any groups on the query, so Amazon Kendra is obtaining them from the Principal Store and automatically setting the filters.
Let’s see the results we get by setting the username to Patricia. The following screenshot shows that the results are from blogs (Patricia is in the Account-Managers group, which in turn is in the Sales group, which is in the Company group), case studies (Patricia is in the Account-Managers group), and analyst reports (Account-Managers is in the Sales group). Again, we didn’t specify any groups, just the username.
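The two username queries above can be reproduced with a small local simulation. This is only a sketch of the behavior described in this post, not how Amazon Kendra implements it, and it makes no AWS calls. The read permissions for whitepapers, blogs, and analyst reports are assumptions inferred from the walkthrough; the function names are hypothetical.

```python
# Group -> groups that directly contain it, mirroring the example hierarchy.
PARENT_GROUPS = {
    "Support": ["Technical"],
    "Engineering": ["Technical"],
    "Account-Managers": ["Sales"],
    "Customer-Service": ["Sales"],
    "Technical": ["Company"],
    "Sales": ["Company"],
}

# User -> groups the user was mapped to directly.
USER_GROUPS = {
    "John": ["Support"],
    "Mary": ["Engineering"],
    "Patricia": ["Account-Managers"],
    "James": ["Customer-Service"],
}

# Document type -> groups allowed to read it.
READ_PERMISSIONS = {
    "Case Studies": {"Account-Managers", "Engineering"},
    "User Guides": {"Customer-Service", "Support"},
    "Whitepapers": {"Technical"},    # assumption inferred from the walkthrough
    "Blogs": {"Company"},            # assumption inferred from the walkthrough
    "Analyst Reports": {"Sales"},    # assumption inferred from the walkthrough
}


def expand_groups(groups):
    """Return the given groups plus every group that contains them, directly or indirectly."""
    seen = set()
    stack = list(groups)
    while stack:
        group = stack.pop()
        if group not in seen:
            seen.add(group)
            stack.extend(PARENT_GROUPS.get(group, []))
    return seen


def visible_types_for_user(user_id):
    """Return the sorted document types the user can read through any effective group."""
    effective = expand_groups(USER_GROUPS.get(user_id, []))
    return sorted(t for t, allowed in READ_PERMISSIONS.items() if allowed & effective)


print(visible_types_for_user("John"))      # ['Blogs', 'User Guides', 'Whitepapers']
print(visible_types_for_user("Patricia"))  # ['Analyst Reports', 'Blogs', 'Case Studies']
```

Walking upward from a user's direct groups is what lets a single username stand in for the whole chain of group memberships.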
Now let’s use Sales as the group and issue the same query. This time the results are from blogs (Sales is in Company) and analyst reports (which can be read by Sales).
This time, let’s issue the query with the group set as Company. The results are only from blogs because blogs are the only type of documents visible to the group Company.
As you can see, we can use the Principal Store to ingest the mapping between users and groups. At query time, Amazon Kendra filters the results based on the mapping of users and groups by using the user ID or group that is provided as part of the query.
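When the application wants to query on behalf of a group rather than a user, the group can be sent in the user context instead. The sketch below assumes the boto3 Kendra `query` call accepts a `Groups` list inside `UserContext`; the index ID, query text, and helper name `build_group_query` are hypothetical placeholders.

```python
def build_group_query(index_id, query_text, groups):
    """Build keyword arguments for a Kendra query that is scoped by group instead of user ID."""
    return {
        "IndexId": index_id,
        "QueryText": query_text,
        # Only the groups named here are sent; no user ID is included.
        "UserContext": {"Groups": list(groups)},
    }


INDEX_ID = "your-index-id"  # hypothetical placeholder
sales_query = build_group_query(INDEX_ID, "example query text", ["Sales"])
print(sales_query["UserContext"])  # {'Groups': ['Sales']}

# Sending the query requires boto3 and AWS credentials (assumption, not run here):
# import boto3
# response = boto3.client("kendra").query(**sales_query)
```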
In this post, you learned how to map your users and groups on Amazon Kendra, which allows you to automatically apply filters based on your user ID or group. You can dive deeper into Amazon Kendra by learning more about how to enrich your content or by using the Amazon Kendra Essentials workshop.
About the Authors
Abhinav Jawadekar is a Senior Partner Solutions Architect at Amazon Web Services. Abhinav works with AWS Partners to help them in their cloud journey.
Vijai Gandikota is a Senior Product Manager at Amazon Web Services for Amazon Kendra.
Juan Bustos is an AI Services Specialist Solutions Architect at Amazon Web Services, based in Dallas, TX. Outside of work, he loves spending time writing and playing music as well as trying random restaurants with his family.