Strengthen the security of sensitive data stored in Amazon S3 by using additional AWS services
October 13, 2021: We’ve added a section on redacting and transforming personally identifiable information with Amazon S3 Object Lambda.
In this post, we describe the AWS services that you can use to detect risks to your data stored in Amazon Simple Storage Service (Amazon S3) and to protect that data. When you analyze security in depth for your Amazon S3 storage, consider doing the following:
- Audit and restrict Amazon S3 access with AWS Identity and Access Management (IAM) Access Analyzer
- Classify and secure sensitive data with Amazon Macie
- Detect malicious access patterns with Amazon GuardDuty
- Monitor and remediate configuration changes with AWS Config
- Redact and transform personally identifiable information (PII) with Amazon S3 Object Lambda
Using these additional AWS services along with Amazon S3 can improve your security posture across your accounts.
Audit and restrict Amazon S3 access with IAM Access Analyzer
IAM Access Analyzer allows you to identify unintended access to your resources and data. Users and developers need access to Amazon S3, but it’s important for you to keep users and privileges accurate and up to date.
Amazon S3 often houses sensitive and confidential information. To help secure your data within Amazon S3, use server-side encryption at rest with AWS Key Management Service (AWS KMS). It is also important to restrict access to your S3 buckets to only the developers and users who require it. Bucket policies and access control lists (ACLs) are the foundation of Amazon S3 security. Your configuration of these policies and lists determines the accessibility of objects within Amazon S3, so audit them regularly to keep your S3 buckets secure.
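As a minimal sketch of the encryption recommendation above, the following Python builds a default-encryption configuration that uses SSE-KMS. The key ARN and bucket name are hypothetical placeholders, and enabling S3 Bucket Keys is an assumption made here to reduce AWS KMS request volume, not something this post prescribes.

```python
import json


def kms_default_encryption(kms_key_arn):
    """Build a default-encryption configuration that uses SSE-KMS."""
    return {
        "Rules": [
            {
                "ApplyServerSideEncryptionByDefault": {
                    "SSEAlgorithm": "aws:kms",
                    "KMSMasterKeyID": kms_key_arn,
                },
                # Assumption: S3 Bucket Keys are enabled to cut KMS calls.
                "BucketKeyEnabled": True,
            }
        ]
    }


# Hypothetical key ARN, for illustration only.
EXAMPLE_KEY_ARN = "arn:aws:kms:us-east-1:111122223333:key/EXAMPLE-KEY-ID"
config = kms_default_encryption(EXAMPLE_KEY_ARN)
print(json.dumps(config, indent=2))

# With boto3 installed and credentials configured, the configuration
# could be applied to a (hypothetical) bucket like this:
#   boto3.client("s3").put_bucket_encryption(
#       Bucket="amzn-s3-demo-bucket",
#       ServerSideEncryptionConfiguration=config,
#   )
```

Building the configuration as plain data keeps it easy to review and reuse across buckets before you apply it.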
IAM Access Analyzer can scan all the supported resources within a zone of trust. Access Analyzer then provides you with insight when a bucket policy or ACL allows access to any external entities that are not within your organization or your AWS account’s zone of trust.
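To make the idea of a zone of trust concrete, here is a simplified sketch that lists the principals in a bucket policy that fall outside a set of trusted account IDs. It is only an illustration of the concept: it is not how IAM Access Analyzer evaluates policies, it ignores condition keys, service principals, and ACLs, and the function names and sample policy are hypothetical.

```python
def _account_id(principal):
    """Return the account ID from an IAM ARN, or the value itself if it is a bare ID."""
    if principal.startswith("arn:"):
        return principal.split(":")[4]
    return principal


def external_principals(policy, trusted_accounts):
    """List principals in Allow statements that are outside the trusted accounts."""
    found = []
    for statement in policy.get("Statement", []):
        if statement.get("Effect") != "Allow":
            continue
        principal = statement.get("Principal", {})
        if principal == "*":
            values = ["*"]
        else:
            aws = principal.get("AWS", [])
            values = [aws] if isinstance(aws, str) else list(aws)
        for value in values:
            if value == "*" or _account_id(value) not in trusted_accounts:
                found.append(value)
    return found


# Hypothetical bucket policy used only to exercise the function.
sample_policy = {
    "Version": "2012-10-17",
    "Statement": [
        {"Effect": "Allow", "Principal": "*", "Action": "s3:GetObject",
         "Resource": "arn:aws:s3:::amzn-s3-demo-bucket/*"},
        {"Effect": "Allow", "Principal": {"AWS": "arn:aws:iam::111122223333:root"},
         "Action": "s3:*", "Resource": "arn:aws:s3:::amzn-s3-demo-bucket/*"},
        {"Effect": "Allow", "Principal": {"AWS": ["arn:aws:iam::444455556666:role/partner"]},
         "Action": "s3:GetObject", "Resource": "arn:aws:s3:::amzn-s3-demo-bucket/*"},
        {"Effect": "Deny", "Principal": "*", "Action": "s3:DeleteObject",
         "Resource": "arn:aws:s3:::amzn-s3-demo-bucket/*"},
    ],
}

external = external_principals(sample_policy, {"111122223333"})
print(external)
```

In this sample, the wildcard principal and the role in account 444455556666 are reported, while the trusted account and the Deny statement are not.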
The example in Figure 1 shows creating an analyzer with the zone of trust as the current account, but you can also create an analyzer with the organization as the zone of trust.
After you create your analyzer, IAM Access Analyzer automatically scans the resources in your zone of trust and returns the findings from your Amazon S3 storage environment. The initial scan shown in Figure 2 shows the findings of an unsecured S3 bucket.
For each finding, you can decide which action you would like to take. As shown in Figure 3, you are given the option to archive (if the finding indicates intended access) or take action to modify bucket permissions (if the finding indicates unintended access).
After you address the initial findings, Access Analyzer monitors your bucket policies for changes, and notifies you of access issues it finds. Access Analyzer is regional and must be enabled in each AWS Region independently.
Classify and secure sensitive data with Macie
Organizational compliance standards often require the identification and securing of sensitive data. Your organization’s sensitive data might contain personally identifiable information (PII), which includes things such as credit card numbers, birthdates, and addresses.
Macie is a data security and privacy service offered by AWS that uses machine learning and pattern matching to discover the sensitive data stored within Amazon S3. You can also define your own custom sensitive data categories that are unique to your business or use case. Macie automatically provides an inventory of your S3 buckets and alerts you to unprotected sensitive data.
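To illustrate what a custom sensitive data category might look like, the following sketch matches a regular expression only when a keyword appears shortly before it. This is loosely modeled on the regex, keyword, and match-distance settings of a Macie custom data identifier. It is a local simplification, not Macie's matching logic, and the employee ID format and keywords are hypothetical.

```python
import re

# Hypothetical identifier format: "E-" followed by six digits.
EMPLOYEE_ID = re.compile(r"\bE-\d{6}\b")
KEYWORDS = ("employee", "staff id")
MAX_MATCH_DISTANCE = 50  # characters to look back for a keyword


def find_sensitive(text):
    """Return regex matches that have a keyword within the preceding window."""
    lowered = text.lower()
    hits = []
    for match in EMPLOYEE_ID.finditer(text):
        start = max(0, match.start() - MAX_MATCH_DISTANCE)
        window = lowered[start:match.start()]
        if any(keyword in window for keyword in KEYWORDS):
            hits.append(match.group())
    return hits


print(find_sensitive("Employee record: E-123456 approved"))
print(find_sensitive("Order E-123456 shipped"))
```

Requiring a nearby keyword is one way to reduce false positives when the pattern alone is too generic.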
Figure 4 shows a sample result from a Macie scan in which you can see important information regarding Amazon S3 public access, encryption settings, and sharing.
In addition to finding potential sensitive data, Macie also gives you a severity score based on the privacy risk, as shown in the example data in Figure 5.
When you use Macie in conjunction with AWS Step Functions, you can also automatically remediate any issues found. You can use this combination to help meet regulations such as General Data Protection Regulation (GDPR) and Health Insurance Portability and Accountability Act (HIPAA). Macie allows you to have constant visibility of sensitive data within your Amazon S3 storage environment.
When you deploy Macie in a multi-account configuration, your usage is rolled up to the Macie administrator account to provide the total usage for all accounts and a breakdown across the entire organization.
Detect malicious access patterns with GuardDuty
Your customers and users can perform thousands of actions each day on S3 buckets. Discerning access patterns manually can be extremely time consuming as the volume of data increases. GuardDuty uses machine learning, anomaly detection, and integrated threat intelligence to analyze billions of events across multiple accounts. It draws on data collected in AWS CloudTrail logs for S3 data events, as well as S3 access logs, VPC Flow Logs, and DNS logs. You can configure GuardDuty to analyze these logs and notify you of suspicious activity, such as unusual data access patterns and unusual discovery API calls. After you receive a list of findings on these activities, you can make informed decisions to secure your S3 buckets.
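Once you have a list of findings, a small amount of code can help you focus on the S3-related ones first. The sketch below assumes findings shaped like the GuardDuty finding format (with `Type`, `Severity`, and `Resource.ResourceType` fields). The sample findings, their severity values, and the threshold are made up for illustration.

```python
def s3_findings_by_severity(findings, min_severity=4.0):
    """Keep S3 bucket findings at or above a severity threshold, highest first."""
    selected = [
        finding
        for finding in findings
        if finding["Resource"]["ResourceType"] == "S3Bucket"
        and finding["Severity"] >= min_severity
    ]
    return sorted(selected, key=lambda finding: finding["Severity"], reverse=True)


# Made-up sample data; severity values are illustrative only.
sample_findings = [
    {"Type": "Policy:S3/BucketBlockPublicAccessDisabled", "Severity": 2.0,
     "Resource": {"ResourceType": "S3Bucket"}},
    {"Type": "UnauthorizedAccess:S3/TorIPCaller", "Severity": 5.0,
     "Resource": {"ResourceType": "S3Bucket"}},
    {"Type": "Discovery:S3/MaliciousIPCaller", "Severity": 8.0,
     "Resource": {"ResourceType": "S3Bucket"}},
    {"Type": "Recon:EC2/PortProbeUnprotectedPort", "Severity": 6.0,
     "Resource": {"ResourceType": "Instance"}},
]

prioritized = [f["Type"] for f in s3_findings_by_severity(sample_findings)]
print(prioritized)
```

The same filter could be applied to findings retrieved through the GuardDuty API or delivered as events, as long as they carry these fields.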
Figure 6 shows a sample list of findings returned by GuardDuty which shows the finding type, resource affected, and count of occurrences.
You can select one of the results in Figure 6 to see the IP address and details associated with this potentially malicious IP caller, as shown in Figure 7.
Monitor and remediate configuration changes with AWS Config
Configuration management is important when securing Amazon S3, to prevent unauthorized users from gaining access. It is important that you monitor the configuration changes of your S3 buckets, whether the changes are intentional or unintentional. AWS Config can track all configuration changes that are made to an S3 bucket. For example, if an S3 bucket had its permissions and configurations unexpectedly changed, using AWS Config allows you to see the changes made, as well as who made them.
With AWS Config, you can set up AWS Config managed rules that serve as a baseline for your S3 buckets. When any bucket has a configuration that deviates from this baseline, Amazon Simple Notification Service (Amazon SNS) can alert you that the bucket is noncompliant.
AWS Config can be used in conjunction with a service called AWS Lambda. If an S3 bucket is noncompliant, AWS Config can trigger a preprogrammed Lambda function and then the Lambda function can resolve those issues. This combination can be used to reduce your operational overhead in maintaining compliance within your S3 buckets.
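A remediation function of this kind might look like the following sketch. It assumes the function receives an AWS Config compliance-change event routed through Amazon EventBridge, and that the desired fix is to turn on S3 Block Public Access for the bucket. Both are assumptions for illustration, not the setup this post prescribes. The stub client and sample event are hypothetical and exist only so the example runs without AWS credentials.

```python
def handler(event, context, s3_client=None):
    """Turn on Block Public Access for an S3 bucket reported as noncompliant."""
    detail = event.get("detail", {})
    compliance = detail.get("newEvaluationResult", {}).get("complianceType")
    if detail.get("resourceType") != "AWS::S3::Bucket" or compliance != "NON_COMPLIANT":
        return {"remediated": False}

    if s3_client is None:
        import boto3  # provided by the Lambda Python runtime
        s3_client = boto3.client("s3")

    bucket = detail["resourceId"]  # for S3 buckets, the resource ID is the bucket name
    s3_client.put_public_access_block(
        Bucket=bucket,
        PublicAccessBlockConfiguration={
            "BlockPublicAcls": True,
            "IgnorePublicAcls": True,
            "BlockPublicPolicy": True,
            "RestrictPublicBuckets": True,
        },
    )
    return {"remediated": True, "bucket": bucket}


class _StubS3Client:
    """Offline stand-in that records calls instead of contacting AWS."""

    def __init__(self):
        self.calls = []

    def put_public_access_block(self, Bucket, PublicAccessBlockConfiguration):
        self.calls.append((Bucket, PublicAccessBlockConfiguration))


# Hypothetical event, trimmed to the fields the handler reads.
sample_event = {
    "detail": {
        "resourceType": "AWS::S3::Bucket",
        "resourceId": "amzn-s3-demo-bucket",
        "newEvaluationResult": {"complianceType": "NON_COMPLIANT"},
    }
}
stub = _StubS3Client()
result = handler(sample_event, None, s3_client=stub)
print(result)
```

Passing the client as an optional argument keeps the handler testable locally while still working unchanged when Lambda invokes it with only `event` and `context`.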
Figure 8 shows a sample of AWS Config managed rules selected for configuration monitoring and gives a brief description of what the rule does.
Figure 9 shows a sample result of a noncompliant configuration and resource inventory listing the type of resource affected and the number of occurrences.
Redact and transform PII with Amazon S3 Object Lambda
Your ability to share data across users, teams, and organizations often depends on the information the data contains. For example, a dataset containing PII might be consumed by multiple applications. To control access to the sensitive data, you might need to process the dataset in different ways, depending on the consuming application, before it is shared. You can use S3 Object Lambda and Amazon Comprehend together to help protect your PII in a cost-effective and accurate manner, and at scale. S3 Object Lambda can invoke Comprehend’s highly accurate natural language processing (NLP)-based PII detection and redaction APIs when an object in S3 is accessed through a GET call. With this integration, you don’t need to maintain different versions of the same data and can redact different PII elements based on access controls, without modifying the underlying data.
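To show the general shape of a function behind an S3 Object Lambda Access Point, here is a minimal sketch. It reads the original object from the presigned URL in the event, redacts it, and returns the result with `write_get_object_response`. The regular expression that masks email addresses is a deliberately simple stand-in for Amazon Comprehend's PII detection, used so the example stays self-contained. The `[EMAIL]` placeholder and the optional `s3_client` and `fetch` parameters are hypothetical.

```python
import re
from urllib.request import urlopen

# Stand-in for Comprehend PII detection: a simple email pattern.
EMAIL = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


def redact(text):
    """Replace anything that looks like an email address with a placeholder."""
    return EMAIL.sub("[EMAIL]", text)


def handler(event, context, s3_client=None, fetch=None):
    """Fetch the original object, redact it, and return it to the caller."""
    object_context = event["getObjectContext"]
    if fetch is None:
        fetch = lambda url: urlopen(url).read()
    original = fetch(object_context["inputS3Url"]).decode("utf-8")

    if s3_client is None:
        import boto3  # provided by the Lambda Python runtime
        s3_client = boto3.client("s3")

    s3_client.write_get_object_response(
        Body=redact(original).encode("utf-8"),
        RequestRoute=object_context["outputRoute"],
        RequestToken=object_context["outputToken"],
    )
    return {"status_code": 200}


print(redact("Contact jane@example.com today"))
```

Because the redaction happens at read time, the stored object is never modified, and different access points can apply different redaction functions to the same data.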
To use S3 Object Lambda and Amazon Comprehend together, you can create an S3 Object Lambda Access Point directly from the S3 Management Console, and then either select an AWS-built Lambda function or invoke a Lambda function in your account, as shown in Figure 10.
After you provide a supporting S3 Access Point to give the S3 Object Lambda access to the original object, you can update your application configuration to use the new S3 Object Lambda Access Point to retrieve objects and data from S3.
You can learn more about the integration between S3 Object Lambda and Comprehend by watching this 5-minute demo on how to configure PII redaction using S3 Object Lambda Access Points, or by reviewing step-by-step instructions to implement Amazon S3 Object Lambda to process and modify data retrieved from Amazon S3.
S3 Object Lambda Access Points work across multiple object types, including images, JSON, and CSV files, providing a flexible way to help meet your needs across multiple data consumers and applications, as shown in Figure 11.
AWS has many offerings to help you audit and secure your storage environment. In this post, we discussed the particular combination of AWS services that together will help reduce the amount of time and focus your business devotes to security practices. This combination of services will also enable you to automate your responses to any unwanted permission and configuration changes, saving you valuable time and resources to dedicate elsewhere in your organization.
For more information about pricing of the services mentioned in this post, see AWS Free Tier and AWS Pricing. For more information about Amazon S3 security, see Amazon S3 Preventative Security Best Practices in the Amazon S3 User Guide.