This Guidance shows how to use Datavant’s data deidentification and tokenization tools on AWS Clean Rooms to protect private patient information while gaining important insights. Using these tools, you can replace private information with encrypted tokens that cannot be reverse engineered to reveal the original information. You and collaborators can then work in AWS Clean Rooms to analyze collective datasets without sharing or moving the underlying data sources. By seamlessly deidentifying, linking, and collaborating on data while retaining fine-grained control, your healthcare and life sciences organization can accelerate and improve insights.
Architecture Diagram
Step 1
Deidentify or tokenize data in an Amazon Simple Storage Service (Amazon S3) bucket using Datavant Switchboard (container). The container is deployed through supported container deployment methods, as detailed in Step 7.
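The sketch below, using the AWS SDK for Python (Boto3), shows one way to stage a source file in Amazon S3 with server-side encryption before the Switchboard container processes it. The bucket, key, and file names are illustrative placeholders, not values defined by this Guidance.

# Sketch: stage source data in Amazon S3 for deidentification.
# Bucket, key, and file names are placeholders.
import boto3

s3 = boto3.client("s3")

# Upload the raw extract with server-side encryption so data at rest is
# protected before the Datavant Switchboard container reads and tokenizes it.
s3.upload_file(
    Filename="patients_raw.csv",
    Bucket="example-healthcare-raw-data",
    Key="input/patients_raw.csv",
    ExtraArgs={"ServerSideEncryption": "aws:kms"},
)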
Step 2
Link the tokenized data with your fellow collaborators using the Datavant Switchboard (container) and store the output in an Amazon S3 bucket.
Step 3
Use an AWS Glue crawler to crawl the linked, tokenized data and register it in the AWS Glue Data Catalog, preparing the data source for collaboration.
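As a minimal sketch, the following Boto3 calls create and start a crawler over the tokenized output; the crawler name, IAM role, database, and S3 path are assumptions for illustration.

# Sketch: catalog the linked, tokenized output with an AWS Glue crawler.
import boto3

glue = boto3.client("glue")

glue.create_crawler(
    Name="tokenized-data-crawler",
    Role="arn:aws:iam::111122223333:role/ExampleGlueCrawlerRole",
    DatabaseName="clean_rooms_collaboration_db",
    Targets={"S3Targets": [{"Path": "s3://example-tokenized-output/linked/"}]},
)

# Run the crawler; the resulting Data Catalog table is what you later
# associate with the AWS Clean Rooms collaboration.
glue.start_crawler(Name="tokenized-data-crawler")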
Step 4
Create an AWS Clean Rooms collaboration and invite members to align on and implement analysis rules. Members can then associate configured tables from the Data Catalog and use an AWS Clean Rooms service role to access their AWS Glue tables.
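The sketch below uses the Boto3 cleanrooms client to create a collaboration and register a cataloged table as a configured table. The account IDs, names, and allowed columns are placeholders, and the parameter shapes shown reflect one reading of the API; verify them against the current AWS Clean Rooms API reference.

# Sketch: create the collaboration and expose a cataloged table to it.
import boto3

cleanrooms = boto3.client("cleanrooms")

collaboration = cleanrooms.create_collaboration(
    name="tokenized-claims-collaboration",
    description="Joint analysis of deidentified, tokenized patient data",
    creatorDisplayName="DataProvider",
    creatorMemberAbilities=[],
    members=[
        {
            "accountId": "444455556666",
            "displayName": "AnalyticsPartner",
            "memberAbilities": ["CAN_QUERY", "CAN_RECEIVE_RESULTS"],
        }
    ],
    queryLogStatus="ENABLED",
)

# Register the AWS Glue table as a configured table, listing only the
# columns collaborators are allowed to reference.
configured_table = cleanrooms.create_configured_table(
    name="linked_tokenized_claims",
    tableReference={
        "glue": {
            "databaseName": "clean_rooms_collaboration_db",
            "tableName": "linked",
        }
    },
    allowedColumns=["patient_token", "diagnosis_code", "service_date"],
    analysisMethod="DIRECT_QUERY",
)

Each member then creates a membership in the collaboration and a configured table association that references the AWS Clean Rooms service role granting read access to their AWS Glue table.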
Step 5
The member who is allowed to query runs aggregation and list queries across tables in the collaboration. Results can be exported to Amazon S3 for the member who is allowed to receive query results.
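As a hedged sketch, the querying member could start a protected aggregation query and direct the output to Amazon S3 as follows; the membership identifier, SQL, and bucket are placeholders, and parameter names should be confirmed against the API reference.

# Sketch: run an aggregation query in the collaboration and write results to S3.
import boto3

cleanrooms = boto3.client("cleanrooms")

cleanrooms.start_protected_query(
    type="SQL",
    membershipIdentifier="example-membership-id",
    sqlParameters={
        "queryString": (
            "SELECT diagnosis_code, COUNT(DISTINCT patient_token) AS patients "
            "FROM linked_tokenized_claims GROUP BY diagnosis_code"
        )
    },
    resultConfiguration={
        "outputConfiguration": {
            "s3": {
                "bucket": "example-query-results",
                "keyPrefix": "collaboration-output/",
                "resultFormat": "CSV",
            }
        }
    },
)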
Step 6
The member who receives query results can use analytics services, including Amazon Redshift, Amazon Athena, Amazon EMR, and Amazon SageMaker, to derive insights from the newly enriched dataset.
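For example, once the exported results have been cataloged (for instance, by another AWS Glue crawler), the receiving member could query them with Amazon Athena. The database, table, and output location below are illustrative.

# Sketch: analyze the exported collaboration results with Amazon Athena.
import boto3

athena = boto3.client("athena")

athena.start_query_execution(
    QueryString=(
        "SELECT diagnosis_code, patients FROM collaboration_results "
        "ORDER BY patients DESC LIMIT 10"
    ),
    QueryExecutionContext={"Database": "collaboration_insights_db"},
    ResultConfiguration={"OutputLocation": "s3://example-athena-results/"},
)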
Step 7
Datavant Switchboard container deployment methods include using AWS Fargate, Amazon Elastic Kubernetes Service (Amazon EKS), and Amazon Elastic Container Service (Amazon ECS).
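As one deployment option, the container can be run as an on-demand task on AWS Fargate through Amazon ECS, as in the sketch below. The cluster, task definition, and network settings are placeholders; the container image itself is supplied by Datavant.

# Sketch: launch the Switchboard container as a Fargate task on Amazon ECS.
import boto3

ecs = boto3.client("ecs")

ecs.run_task(
    cluster="example-tokenization-cluster",
    launchType="FARGATE",
    taskDefinition="datavant-switchboard:1",
    networkConfiguration={
        "awsvpcConfiguration": {
            "subnets": ["subnet-0123456789abcdef0"],
            "securityGroups": ["sg-0123456789abcdef0"],
            "assignPublicIp": "DISABLED",
        }
    },
)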
Get Started
Well-Architected Pillars
The AWS Well-Architected Framework helps you understand the pros and cons of the decisions you make when building systems in the cloud. The six pillars of the Framework allow you to learn architectural best practices for designing and operating reliable, secure, efficient, cost-effective, and sustainable systems. Using the AWS Well-Architected Tool, available at no charge in the AWS Management Console, you can review your workloads against these best practices by answering a set of questions for each pillar.
The architecture diagram above is an example of a Solution created with Well-Architected best practices in mind. To be fully Well-Architected, you should follow as many Well-Architected best practices as possible.
-
Operational Excellence
Amazon Elastic Container Registry (Amazon ECR) stores and manages the container images you build, integrating with existing continuous integration and delivery build processes and supporting multistage-build and package-management deployment modalities. As two supported deployment methods, Amazon EKS monitors the health of container applications through liveness probes, and Amazon ECS monitors it through health checks. Both methods support logging by writing to the standard output (stdout) and standard error (stderr) streams.
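The sketch below registers an Amazon ECS task definition that adds a container health check and sends stdout/stderr to Amazon CloudWatch Logs through the awslogs driver. The image URI, health-check command, role, and log group are illustrative assumptions.

# Sketch: ECS task definition with a health check and stdout/stderr logging.
import boto3

ecs = boto3.client("ecs")

ecs.register_task_definition(
    family="datavant-switchboard",
    requiresCompatibilities=["FARGATE"],
    networkMode="awsvpc",
    cpu="1024",
    memory="2048",
    executionRoleArn="arn:aws:iam::111122223333:role/ExampleEcsExecutionRole",
    containerDefinitions=[
        {
            "name": "switchboard",
            "image": "111122223333.dkr.ecr.us-east-1.amazonaws.com/switchboard:latest",
            "essential": True,
            "healthCheck": {
                # Replace with a real readiness command for the container.
                "command": ["CMD-SHELL", "exit 0"],
                "interval": 30,
                "timeout": 5,
                "retries": 3,
            },
            "logConfiguration": {
                "logDriver": "awslogs",
                "options": {
                    "awslogs-group": "/ecs/datavant-switchboard",
                    "awslogs-region": "us-east-1",
                    "awslogs-stream-prefix": "switchboard",
                },
            },
        }
    ],
)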
-
Security
Amazon ECR statically scans images, dependencies, and libraries for common vulnerabilities and exposures. Additionally, Snyk is enabled for real-time container security, and Amazon Cognito generates JSON web tokens that enable secure communication to functions that require internet access. Amazon Cognito and AWS Identity and Access Management (IAM) support the principle of least privilege and help you avoid hard-coding credentials. In addition to the static and dynamic scanning performed by Amazon ECR and Snyk, respectively, the container image is hardened and built from software with trusted, verified signatures.
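For example, scan-on-push can be enabled on the repository that holds the Switchboard image so every pushed image is checked for known CVEs; the repository name and encryption settings below are placeholders.

# Sketch: create an ECR repository with scan-on-push and KMS encryption.
import boto3

ecr = boto3.client("ecr")

ecr.create_repository(
    repositoryName="datavant-switchboard",
    imageScanningConfiguration={"scanOnPush": True},
    imageTagMutability="IMMUTABLE",
    encryptionConfiguration={"encryptionType": "KMS"},
)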
-
Reliability
AWS Clean Rooms is a regional serverless service. This service runs on the AWS global infrastructure, which is built around AWS Regions and Availability Zones. AWS Regions provide multiple physically separated and isolated Availability Zones, which are connected through low-latency, high-throughput, and highly redundant networking.
-
Performance Efficiency
AWS Clean Rooms scales compute resources up or down automatically to match query workload demands, so you can adjust your data analysis capacity without managing infrastructure. This helps ensure efficient utilization and cost optimization while maintaining data privacy and security within the collaborative environment.
-
Cost Optimization
Amazon ECS and Amazon EKS scale automatically to meet demand, helping you optimize your costs based on load. Containers listen for a SIGTERM signal to scale down, and a container’s application startup time is optimized to enable cost savings when invoking the application from a cold start. Additionally, Amazon ECR lets you set up lifecycle policies to further reduce costs.
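As a sketch of the lifecycle-policy point, the following applies an Amazon ECR policy that expires untagged images after 14 days so stale layers do not accumulate storage costs; the repository name and retention period are illustrative.

# Sketch: apply an ECR lifecycle policy that expires untagged images.
import json
import boto3

ecr = boto3.client("ecr")

lifecycle_policy = {
    "rules": [
        {
            "rulePriority": 1,
            "description": "Expire untagged images older than 14 days",
            "selection": {
                "tagStatus": "untagged",
                "countType": "sinceImagePushed",
                "countUnit": "days",
                "countNumber": 14,
            },
            "action": {"type": "expire"},
        }
    ]
}

ecr.put_lifecycle_policy(
    repositoryName="datavant-switchboard",
    lifecyclePolicyText=json.dumps(lifecycle_policy),
)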
-
Sustainability
Keeping container images right-sized, regularly updating parent and base images, and purging unused or obsolete images from Amazon ECR all reduce storage and build overhead. Additionally, Amazon EKS and Amazon ECS scale down when resources are not needed, and they support optimal architectures for container image builds. By removing unused images, scaling down idle capacity, and enabling you to optimize hardware for performance, these services reduce energy consumption, supporting sustainability.
Related Content
Disclaimer
The sample code; software libraries; command line tools; proofs of concept; templates; or other related technology (including any of the foregoing that are provided by our personnel) is provided to you as AWS Content under the AWS Customer Agreement, or the relevant written agreement between you and AWS (whichever applies). You should not use this AWS Content in your production accounts, or on production or other critical data. You are responsible for testing, securing, and optimizing the AWS Content, such as sample code, as appropriate for production grade use based on your specific quality control practices and standards. Deploying AWS Content may incur AWS charges for creating or using AWS chargeable resources, such as running Amazon EC2 instances or using Amazon S3 storage.
References to third-party services or organizations in this Guidance do not imply an endorsement, sponsorship, or affiliation between Amazon or AWS and the third party. Guidance from AWS is a technical starting point, and you can customize your integration with third-party services when you deploy the architecture.