AWS Public Sector Blog

Securing your data by knowing your data

In many organisations, IT security and data governance processes can be complex because data is stored across multiple environments and applications. Data privacy, including data classification, is now a core component of security requirements. Organisations need an easier and more pragmatic approach to administering their data assets to mitigate operational risk.

Before mandatory data privacy regulations such as the General Data Protection Regulation (GDPR), the Health Insurance Portability and Accountability Act (HIPAA), and the Information Security Manual (ISM), data stores were administered by IT security based on their environment, application, group of users, or use case. Now, a data store’s classification must consider confidentiality, integrity, and availability as a baseline for data security.

  • Confidentiality – Only authorized access permitted.
  • Integrity – Completeness, accuracy and freedom from unauthorized change.
  • Availability – Accessibility and usability when required.

Amazon Simple Storage Service (Amazon S3), together with AWS Identity and Access Management (IAM), can assist your organisation in meeting its IT and data security requirements.

Amazon S3 object tags for data security and classification

Amazon S3 includes a feature called object tags which, when used in combination with AWS IAM, gives you granular control over access to objects in Amazon S3. Object tags are user-created key/value pairs that you can add to Amazon S3 buckets or objects. Much as EXIF data describes a JPEG image, object tags describe an S3 object; they are stored alongside the object and are not added to its content.

For example, you could create key/value pairs such as:
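As an illustration only, the sketch below shows how a set of key/value pairs could be prepared for Amazon S3 in Python. The tag keys and values (`classification`, `pii`) are hypothetical and not taken from this post. The helper names are also assumptions. The two output shapes correspond to the `Tagging` argument of boto3's `put_object_tagging` (a `TagSet` list) and of `put_object` (a URL-encoded string).

```python
from typing import Dict, List, Mapping
from urllib.parse import urlencode

# Amazon S3 object tag limits: at most 10 tags per object,
# keys up to 128 characters, values up to 256 characters.
MAX_TAGS = 10
MAX_KEY_LEN = 128
MAX_VALUE_LEN = 256


def validate_tags(tags: Mapping[str, str]) -> None:
    """Raise ValueError if the tags exceed the S3 object tag limits."""
    if len(tags) > MAX_TAGS:
        raise ValueError(f"an object can carry at most {MAX_TAGS} tags")
    for key, value in tags.items():
        if not key or len(key) > MAX_KEY_LEN:
            raise ValueError(f"invalid tag key: {key!r}")
        if len(value) > MAX_VALUE_LEN:
            raise ValueError(f"tag value too long for key {key!r}")


def to_tag_set(tags: Mapping[str, str]) -> Dict[str, List[Dict[str, str]]]:
    """Build the structure passed as Tagging= to put_object_tagging."""
    validate_tags(tags)
    return {"TagSet": [{"Key": k, "Value": v} for k, v in tags.items()]}


def to_tagging_string(tags: Mapping[str, str]) -> str:
    """Build the URL-encoded string passed as Tagging= to put_object."""
    validate_tags(tags)
    return urlencode(tags)


# Hypothetical tags, for illustration only.
example_tags = {"classification": "confidential", "pii": "true"}

# With boto3 installed and credentials configured, usage might look like:
#   s3 = boto3.client("s3")
#   s3.put_object_tagging(Bucket="example-bucket", Key="example.csv",
#                         Tagging=to_tag_set(example_tags))
```

Note that `put_object_tagging` replaces the object's entire tag set, so the full set of tags needs to be supplied on each call.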

Data domains in Amazon S3, providing security with context

Adding Amazon S3 object tags to objects within data domain hierarchies allows you to create a logical and pragmatic way to secure your data while also providing context for the data. Let’s look at an example.

An organisation has two data domains in Amazon S3; Wireless and Facility:

Figure 1 – Data domains and classification

By adding a data classification to each object, we can see the relationship between the data domain and the level of data classification. If we wanted to add further granularity to the Amazon S3 objects, we could add a use case, such as analytics, or a flag that identifies that the data contains personally identifiable information (PII). Learn more about S3 object tags.
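One way to keep domain, classification, use case, and PII flag consistent is to derive the tags from the object key. The sketch below assumes that the first path segment of the key names the data domain (for example `wireless/...`). That layout, the tag keys, and the classification value are hypothetical choices for illustration, not something this post prescribes.

```python
from typing import Dict, Optional

# The two data domains from the example above, lowercased.
KNOWN_DOMAINS = {"wireless", "facility"}


def tags_for_object(
    key: str,
    classification: str,
    use_case: Optional[str] = None,
    pii: bool = False,
) -> Dict[str, str]:
    """Derive object tags from the key prefix plus explicit attributes."""
    # Assumption: the first path segment of the key is the data domain.
    domain = key.split("/", 1)[0].lower()
    if domain not in KNOWN_DOMAINS:
        raise ValueError(f"unknown data domain in key: {key!r}")
    tags = {"data-domain": domain, "classification": classification}
    if use_case:
        tags["use-case"] = use_case
    if pii:
        tags["pii"] = "true"
    return tags


tags = tags_for_object(
    "wireless/2019/usage.csv", "confidential", use_case="analytics", pii=True
)
```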

Securing Amazon S3 with AWS IAM

Once Amazon S3 object tags have been created, you can create AWS IAM policies that use object tag values to control access to the Amazon S3 objects. Giving IAM policies self-describing names makes it easy for anyone looking through the policies in IAM to understand the purpose of each one.

As the table below illustrates, using a combination of the data domain, data classification, and use case as the IAM policy name makes the purpose of each policy clear.
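A minimal sketch of such a policy, built in Python, might look as follows. It relies on the `s3:ExistingObjectTag/<key>` condition key, which lets a policy statement match on an object's existing tag values. The bucket name, tag keys, and naming scheme are hypothetical.

```python
import json
from typing import Any, Dict


def policy_name(domain: str, classification: str, use_case: str) -> str:
    """Compose a self-describing policy name from its three attributes."""
    return "-".join(part.lower() for part in (domain, classification, use_case))


def read_policy(
    bucket: str, domain: str, classification: str, use_case: str
) -> Dict[str, Any]:
    """Allow s3:GetObject only on objects carrying all three tag values."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "ReadTaggedObjects",
                "Effect": "Allow",
                "Action": "s3:GetObject",
                "Resource": f"arn:aws:s3:::{bucket}/*",
                # Keys under one condition operator must all match (AND).
                "Condition": {
                    "StringEquals": {
                        "s3:ExistingObjectTag/data-domain": domain,
                        "s3:ExistingObjectTag/classification": classification,
                        "s3:ExistingObjectTag/use-case": use_case,
                    }
                },
            }
        ],
    }


name = policy_name("Wireless", "Confidential", "Analytics")
document = json.dumps(
    read_policy("example-bucket", "wireless", "confidential", "analytics"),
    indent=2,
)
```

Tag-based access control is only as strong as control over the tags themselves, so permissions such as `s3:PutObjectTagging` would typically be limited to the people who own the classification.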

Learn more about Amazon S3 object tags and IAM.

Beyond object tag security, Amazon S3 offers further security features, including:

  • Encryption at rest
  • Object logging
  • API logging
  • Deletion prevention
  • Versioning
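Two of the features above, encryption at rest and versioning, can be switched on per bucket. The sketch below assumes a boto3 S3 client is passed in and uses its `put_bucket_encryption` and `put_bucket_versioning` calls. The helper names and the choice of SSE-S3 (`AES256`) as the default algorithm are assumptions for illustration.

```python
from typing import Any, Dict


def encryption_request(bucket: str) -> Dict[str, Any]:
    """Request that makes SSE-S3 (AES256) the bucket's default encryption."""
    return {
        "Bucket": bucket,
        "ServerSideEncryptionConfiguration": {
            "Rules": [
                {"ApplyServerSideEncryptionByDefault": {"SSEAlgorithm": "AES256"}}
            ]
        },
    }


def versioning_request(bucket: str) -> Dict[str, Any]:
    """Request that enables versioning on the bucket."""
    return {"Bucket": bucket, "VersioningConfiguration": {"Status": "Enabled"}}


def harden_bucket(s3_client: Any, bucket: str) -> None:
    """Apply default encryption and versioning using the given S3 client."""
    s3_client.put_bucket_encryption(**encryption_request(bucket))
    s3_client.put_bucket_versioning(**versioning_request(bucket))


# With boto3 installed and credentials configured, usage might look like:
#   import boto3
#   harden_bucket(boto3.client("s3"), "example-bucket")  # hypothetical bucket
```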

Learn more about Amazon S3 security here and here.

Data integrity, usability, and consistency through Amazon S3 metadata tags

Assuring your data users that the data they are using is accurate and fit for purpose is critical for an organisation that wants to make its data usable.

By default, Amazon S3 provides object metadata for every object, including:

  • Date
  • Content-Length
  • Last-Modified
  • Content-MD5
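Of these, Content-MD5 relates most directly to data integrity: it is the base64-encoded MD5 digest of the object body. As a minimal sketch (the helper names are assumptions), the value can be computed locally before upload or used to check a downloaded copy:

```python
import base64
import hashlib


def content_md5(body: bytes) -> str:
    """Return the base64-encoded MD5 digest used as a Content-MD5 value."""
    return base64.b64encode(hashlib.md5(body).digest()).decode("ascii")


def matches_content_md5(body: bytes, expected: str) -> bool:
    """Check that a body still matches a previously recorded Content-MD5."""
    return content_md5(body) == expected


# With boto3, the value can be sent with an upload so that Amazon S3
# rejects the request if the body was corrupted in transit, e.g.:
#   s3.put_object(Bucket="example-bucket", Key="example.csv",
#                 Body=body, ContentMD5=content_md5(body))
```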

In addition, Amazon S3 includes a feature called user-defined metadata, which allows custom metadata to be added to each object. User-defined metadata consists of user-created key/value pairs that you can add to Amazon S3 objects. Much as EXIF data describes a JPEG image, user-defined metadata describes an S3 object and is not added to its content.

By adding user-defined metadata to Amazon S3 objects, a user can gain assurance and a deeper understanding of the data. We regularly see the following metadata included with Amazon S3 objects:
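As a hedged sketch, user-defined metadata can be prepared as follows. Amazon S3 stores these pairs as `x-amz-meta-*` headers, keeps the keys in lowercase, and limits user-defined metadata to 2 KB. The metadata keys shown (`source-system`, `schema-version`) are hypothetical examples, and the size check assumes the limit is measured over the UTF-8 bytes of each key and value.

```python
from typing import Dict, Mapping

META_PREFIX = "x-amz-meta-"
MAX_USER_METADATA_BYTES = 2 * 1024  # S3 limit for user-defined metadata


def user_metadata_size(metadata: Mapping[str, str]) -> int:
    """Sum of the UTF-8 byte lengths of every key and value."""
    return sum(
        len(k.encode("utf-8")) + len(v.encode("utf-8"))
        for k, v in metadata.items()
    )


def to_metadata_headers(metadata: Mapping[str, str]) -> Dict[str, str]:
    """Lowercase the keys, check the size limit, and add the header prefix."""
    normalised = {k.lower(): v for k, v in metadata.items()}
    if user_metadata_size(normalised) > MAX_USER_METADATA_BYTES:
        raise ValueError("user-defined metadata exceeds 2 KB")
    return {META_PREFIX + k: v for k, v in normalised.items()}


# Hypothetical metadata, for illustration only.
headers = to_metadata_headers({"Source-System": "billing", "schema-version": "3"})

# boto3 adds the x-amz-meta- prefix itself, so there the plain pairs are
# passed instead, e.g.:
#   s3.put_object(Bucket="example-bucket", Key="example.csv", Body=body,
#                 Metadata={"source-system": "billing", "schema-version": "3"})
```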

Learn more about Amazon S3 user-defined metadata.

Next steps

If you are ready to get started with a data lake, please download the project template here. You can also join us at this year’s AWS Immersion Day, held across eight cities in Australia. Have your questions answered, engage with our team of solutions architects, and discover solutions to your technology challenges.


A post by Paul Macey, Specialist Solutions Architect, Big Data and Analytics, AWS