Mitigating Sensitive Data-Related Risks via Foundational Technical Review (FTR) for SaaS Solutions

By Phil Varghese, Partner Solutions Architect – AWS

Most software-as-a-service (SaaS) solutions that undergo an AWS Foundational Technical Review (FTR) ingest, manage, and store sensitive data.

Sensitive data consists mainly of personally identifiable information (PII) and protected health information (PHI). Working with sensitive data in SaaS solutions requires you to think about how customer data will be transmitted, stored, and processed without undermining the security, manageability, and performance of your SaaS solution.

The FTR is a review based on the AWS Well-Architected Framework that enables AWS Partners to identify and remediate risks in their solutions. In this post, learn how to manage and secure sensitive data within your SaaS solution, with a focus on the FTR requirements related to PII and PHI.

Sensitive Data in SaaS Solutions

Sensitive data, including PII and PHI, has wide-ranging country- and region-specific definitions. The definitions of PII and PHI should therefore be read in context, taking into account the countries in which the partner operates and its user base.

For the purposes of this post, we can define PII as “information which can be used to distinguish or trace an individual’s identity, such as their name, social security number, biometric records, alone, or when combined with other personal or identifying information which is linked or linkable to a specific individual, such as date and place of birth and mother’s maiden name.” We can then define PHI as “individually identifiable health information” as per NIST.

In SaaS solutions put through an FTR, we have observed that PII is primarily related to login credentials and personal data collected by the solution. PHI requirements apply to a SaaS solution if it collects or manages health-related personal information, whether or not the solution operates in the health domain, and they come on top of the regulatory requirements of the region, such as HIPAA in the United States.

If AWS Partners are using application-based authentication, PII is typically stored in databases such as Amazon Relational Database Service (Amazon RDS), Amazon Aurora, or a self-managed database on Amazon Elastic Compute Cloud (Amazon EC2). Storage services like Amazon Simple Storage Service (Amazon S3), Amazon Elastic File System (Amazon EFS), and Amazon Elastic Block Store (Amazon EBS) are also used to store PII.

PHI, on the other hand, can be found in the above-mentioned services depending on the application, data, and region of operation. In addition to directly identified PHI or PII, partners should also consider data that may have a direct impact on their customers. For example, if system metadata about a particular form view by a given user in a SaaS application is compromised, it may reveal the sales volume of a particular product for that user.

Data Classification

A common question is how to start addressing sensitive data-related risks, and that’s where the data classification FTR requirement provides direction. AWS Partners must have a data classification system in place to identify sensitive data within their solution.

The data classification artifact typically takes the form of a document, specific to the application, in which data that’s not common public knowledge (such as PII and PHI) is identified and documented. For example, the document may indicate that a particular Amazon RDS (MySQL) table in the SaaS application holds user credentials, which are PII and classified as “Sensitive.” An Amazon S3 bucket could also store images containing user information, which is again PII and classified as “Confidential,” while another S3 bucket stores generic images that do not contain PII or PHI and is therefore classified as “Public.”
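The example above can be captured in a machine-readable form alongside the written document. The following is a minimal sketch in Python, not a prescribed FTR format; the resource names (`app-db.users`, `user-uploads`, `static-assets`) and the helper functions are hypothetical and chosen only to mirror the three stores described in the text.

```python
from dataclasses import dataclass
from enum import Enum


class Label(Enum):
    """Classification labels used in the example in the text."""
    PUBLIC = "Public"
    CONFIDENTIAL = "Confidential"
    SENSITIVE = "Sensitive"


@dataclass(frozen=True)
class DataStore:
    """One entry in the data classification document."""
    name: str            # hypothetical resource identifier
    service: str         # AWS service that holds the data
    contains: frozenset  # kinds of sensitive data present, e.g. "PII", "PHI"
    label: Label


# Hypothetical inventory mirroring the three stores described in the text.
INVENTORY = [
    DataStore("app-db.users", "Amazon RDS (MySQL)", frozenset({"PII"}), Label.SENSITIVE),
    DataStore("user-uploads", "Amazon S3", frozenset({"PII"}), Label.CONFIDENTIAL),
    DataStore("static-assets", "Amazon S3", frozenset(), Label.PUBLIC),
]


def needs_protection(store: DataStore) -> bool:
    """A store needs protection controls whenever it is not labelled Public."""
    return store.label is not Label.PUBLIC


def misclassified(inventory):
    """Return stores labelled Public even though they hold PII or PHI."""
    return [s for s in inventory if s.label is Label.PUBLIC and s.contains]


protected_names = [s.name for s in INVENTORY if needs_protection(s)]
```

Keeping the inventory as data makes it possible to run simple consistency checks, such as `misclassified(INVENTORY)`, whenever a new data store is added.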

The data classification system should act as a map for data protection (at rest and in transit) and for threat mitigation strategies if there’s a data leak. As an example, Center for Internet Security (CIS) Critical Security Controls Version 8 recommends companies use labels, such as Sensitive, Confidential, and Public, and classify their data according to those labels. Similarly, the U.S. government uses a three-tier classification scheme, namely Confidential, Secret, and Top Secret, for national security information as described in Executive Order 13526.

AWS Partners can use Amazon Macie to discover and protect sensitive data in S3. Amazon Macie is a fully managed data security and data privacy service that uses machine learning (ML) and pattern matching. Partners can also use AWS Glue’s PII detection and remediation feature to automatically detect PII at both the column and cell levels during an AWS Glue job run. For example, AWS Glue can identify a variety of PII data and allows partners to act on the PII, such as tracking it for audit purposes or redacting the PII before writing it into a data lake.

Amazon Comprehend can also detect PII entities in English text documents. For example, Comprehend can analyze support tickets and knowledge articles to detect PII entities and redact the text before indexing the documents in a search solution.
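The redaction step can be sketched as follows. This is an illustration under stated assumptions rather than a reference implementation: the function `redact_pii`, the sample ticket, and the `min_score` threshold are hypothetical, and the entity list is hand-written in the shape returned by the Comprehend `detect_pii_entities` API (`Type`, `Score`, `BeginOffset`, `EndOffset`) so that the example runs without AWS credentials.

```python
def redact_pii(text: str, entities: list, min_score: float = 0.5) -> str:
    """Replace each detected PII span with its entity type in brackets."""
    kept = [e for e in entities if e["Score"] >= min_score]
    # Work from the end of the text so that earlier offsets stay valid.
    for e in sorted(kept, key=lambda e: e["BeginOffset"], reverse=True):
        text = text[:e["BeginOffset"]] + f"[{e['Type']}]" + text[e["EndOffset"]:]
    return text


# Hand-written sample in the response shape. In a real run the list would
# come from something like:
#   boto3.client("comprehend").detect_pii_entities(
#       Text=ticket, LanguageCode="en")["Entities"]
ticket = "Contact jane@example.com about order 1182."
sample_entities = [
    {"Type": "EMAIL", "Score": 0.99, "BeginOffset": 8, "EndOffset": 24},
]

redacted = redact_pii(ticket, sample_entities)
print(redacted)  # Contact [EMAIL] about order 1182.
```

Replacing spans from the end of the string backwards avoids having to recompute offsets after each substitution.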

Protecting Data at Rest

We’ve looked at classifying data handled by an application and identifying PII and PHI. Now, let’s explore how to protect that data.

A requirement in the Sensitive Data section of the FTR is to encrypt all sensitive data at rest, which helps limit exposure if a data store is leaked. Based on the data classification system described above, every data store that holds PII or PHI should be encrypted.
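One way to check this requirement is to compare the classification inventory against the encryption status of each store. The sketch below assumes a hypothetical inventory format (`name`, `contains`, `encrypted_at_rest`); in practice the encryption flag would be read from each service, for example the `StorageEncrypted` field on an Amazon RDS DB instance or the `Encrypted` field on an Amazon EBS volume.

```python
def unencrypted_sensitive_stores(stores):
    """Return names of stores that hold PII or PHI but are not encrypted at rest."""
    return [
        s["name"]
        for s in stores
        if s["contains"] and not s["encrypted_at_rest"]
    ]


# Hypothetical inventory; the names and flags are illustrative only.
STORES = [
    {"name": "app-db", "contains": {"PII"}, "encrypted_at_rest": True},
    {"name": "user-uploads", "contains": {"PII", "PHI"}, "encrypted_at_rest": False},
    {"name": "static-assets", "contains": set(), "encrypted_at_rest": False},
]

findings = unencrypted_sensitive_stores(STORES)
print(findings)  # ['user-uploads']
```

The public bucket is not flagged because it holds no PII or PHI, which is exactly the distinction the data classification system is meant to provide.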

Let’s look at how we can address encryption for a couple of commonly used AWS services that handle sensitive data. If encryption is enabled in Amazon EBS, the service works with AWS Key Management Service (AWS KMS) to encrypt the volumes. Amazon S3 server-side encryption encrypts S3 objects at rest, using either Amazon S3 managed keys (SSE-S3) or keys managed in AWS KMS (SSE-KMS).

For Amazon RDS and other database offerings from AWS, AWS KMS-integrated options can be used to encrypt the database. Depending on the particular AWS service, keep in mind the available instance types, classes, and corresponding limitations when enabling encryption. With most of our services, AWS Partners have the choice of using an AWS managed key, which is the default encryption key in AWS KMS, or creating their own KMS key. We highly recommend that partners rotate these encryption keys regularly.
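As an illustration of the "create your own KMS key" option, the sketch below builds the default-encryption configuration that the Amazon S3 `PutBucketEncryption` API accepts. The helper name `kms_default_encryption` and the key ARN are hypothetical placeholders, and the actual API calls are shown only as comments so that the example runs without AWS credentials.

```python
def kms_default_encryption(kms_key_arn: str) -> dict:
    """Build an S3 default-encryption configuration that uses a KMS key."""
    return {
        "Rules": [
            {
                "ApplyServerSideEncryptionByDefault": {
                    "SSEAlgorithm": "aws:kms",
                    "KMSMasterKeyID": kms_key_arn,
                },
                # S3 Bucket Keys reduce the number of requests made to AWS KMS.
                "BucketKeyEnabled": True,
            }
        ]
    }


# Placeholder ARN for illustration; substitute the ARN of your own key.
KEY_ARN = "arn:aws:kms:us-east-1:111122223333:key/EXAMPLE-KEY-ID"
config = kms_default_encryption(KEY_ARN)

# With boto3 installed and credentials configured, the configuration could be
# applied, and automatic rotation turned on for the key, roughly like this:
#   boto3.client("s3").put_bucket_encryption(
#       Bucket="user-uploads", ServerSideEncryptionConfiguration=config)
#   boto3.client("kms").enable_key_rotation(KeyId=KEY_ARN)
```

Building the configuration as plain data keeps it easy to review and to reuse across every bucket that the classification document marks as holding PII or PHI.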

We also recommend partners use Amazon Cognito instead of building their own app-based authentication and authorization systems. Amazon Cognito lets partners add user sign-up, sign-in, and access control to web and mobile apps quickly and easily. Additionally, Cognito encrypts user credentials by default.

Protecting Data in Transit

Let’s explore how to protect data while in transit. Another requirement in the Sensitive Data section of the FTR is to only use network protocols with encryption when transmitting sensitive data outside your virtual private cloud (VPC). This requirement translates to using HTTPS or other SSL/TLS-based protocols when communicating outside the VPC.
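On the application side, this requirement can be enforced in code that calls endpoints outside the VPC. The following sketch uses only the Python standard library; the helper names `strict_tls_context` and `require_https`, the example URL, and the choice of TLS 1.2 as the minimum version are assumptions made for illustration and are not stated in the FTR.

```python
import ssl
from urllib.parse import urlsplit


def strict_tls_context() -> ssl.SSLContext:
    """Client-side TLS context that verifies certificates and hostnames."""
    ctx = ssl.create_default_context()  # certificate and hostname checks are on
    ctx.minimum_version = ssl.TLSVersion.TLSv1_2  # assumed minimum, adjust as needed
    return ctx


def require_https(url: str) -> str:
    """Return the URL unchanged if it uses HTTPS, otherwise raise ValueError."""
    if urlsplit(url).scheme != "https":
        raise ValueError(f"refusing unencrypted endpoint: {url}")
    return url


ctx = strict_tls_context()
endpoint = require_https("https://api.example.com/v1/records")

# A request to the endpoint could then be made with, for example:
#   urllib.request.urlopen(endpoint, context=ctx)
```

Rejecting non-HTTPS URLs before a connection is opened means a misconfigured endpoint fails loudly instead of silently sending sensitive data in clear text.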

For solutions in AWS that need to terminate TLS, AWS offers several options, including Elastic Load Balancing (for example, Application Load Balancer and Network Load Balancer), Amazon CloudFront, and Amazon API Gateway. The process of generating, distributing, and rotating digital certificates used for SSL/TLS encryption can be simplified using AWS Certificate Manager.

Using services like AWS KMS, AWS CloudHSM, and AWS Certificate Manager, partners can implement a comprehensive data at rest and data in transit encryption strategy across their AWS ecosystem.

PHI and HIPAA

If the SaaS solution has customers in the U.S. and handles PHI, partners also need to have a Business Associate Addendum (BAA) in place with AWS for every AWS account containing PHI. Partners should also use HIPAA-eligible AWS services for processing and storing PHI.

Further details can be found on the AWS HIPAA compliance web page. By addressing the above, AWS Partners will satisfy the requirements in the Protected Health Information section of the FTR.

General Recommendations

We also recommend enabling comprehensive logging wherever PII and PHI are stored and processed in your solution. These logs should be sent to a dedicated audit account or logging solution to enable traceability and auditability in case of a data breach. This is, however, not a requirement for the FTR.

Conclusion

In this post, I explained various aspects of sensitive data and the associated PII and PHI requirements of the AWS Foundational Technical Review (FTR). For more on data classification, please refer to the Data Classification: Secure Cloud Adoption AWS whitepaper and CIS Critical Security Controls Version 8.

By enabling the required controls to protect data at rest and in transit, AWS Partners can protect their customers’ data and strengthen their customers’ trust. Please refer to the corresponding product documentation on how to enable encryption for data at rest and in transit.

To raise an FTR request, please register as an AWS Partner via AWS Partner Central, join the Software Path, and follow the steps in the FTR guide. Alternatively, please feel free to reach out to your partner development team contact or account team.