This Guidance helps you modernize your record retention so you can extract value from your data while staying compliant with record-keeping rules from the U.S. Securities and Exchange Commission (SEC), the Commodity Futures Trading Commission (CFTC), and the Financial Industry Regulatory Authority (FINRA). Financial services institutions (FSIs) are expected to retain records compliantly, and they often do so with on-premises legacy storage solutions that do not scale, require constant hardware and software refreshes, and make it difficult for end customers to access the data. With this Guidance, you can use cloud-native services for storing, processing, and monitoring access to data, so analysts, data scientists, and other stakeholders can work with the data while staying in compliance with regulators.
Please note: the Disclaimer at the end of this Guidance applies to the sample code and deployment resources described here.
Architecture Diagram
Step 1
Transaction data is created in line-of-business applications.
Step 2
AWS DataSync, AWS Transfer Family, or AWS Snowball transfers the data to an AWS Region.
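If DataSync is your transfer mechanism, each run of an existing task can be started programmatically. The sketch below is a minimal example using boto3; the task ARN is a placeholder, and the task itself (pairing the on-premises source location with the destination S3 bucket) is assumed to exist already.

```python
import boto3

# Placeholder ARN for a pre-existing DataSync task that links an
# on-premises source location to an S3 destination location.
TASK_ARN = "arn:aws:datasync:us-east-1:111122223333:task/task-EXAMPLE"

datasync = boto3.client("datasync")

# Start one transfer run; DataSync verifies the copied data with
# checksums (see the Reliability pillar below).
response = datasync.start_task_execution(TaskArn=TASK_ARN)
print("Started execution:", response["TaskExecutionArn"])
```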
Step 3
Amazon Simple Storage Service (Amazon S3) stores data in its raw form.
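As a minimal illustration, a line-of-business application or transfer process might land a raw file like this (the bucket name and key are hypothetical); date-based key prefixes simplify the crawling and partitioning steps that follow.

```python
import boto3

s3 = boto3.client("s3")

# Upload one raw file; bucket and key names are illustrative.
with open("trades.csv", "rb") as body:
    s3.put_object(
        Bucket="example-fsi-raw-data",
        Key="trades/2024/01/15/trades.csv",
        Body=body,
        ServerSideEncryption="aws:kms",  # encrypt at rest with AWS KMS
    )
```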
Step 4
AWS Glue crawlers discover and catalog the raw data.
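A crawler for the raw bucket can be defined and run with a few API calls. In this hedged sketch, the crawler name, IAM role, database, and S3 path are all illustrative assumptions:

```python
import boto3

glue = boto3.client("glue")

# Define a crawler over the raw prefix; all names are hypothetical.
glue.create_crawler(
    Name="raw-trades-crawler",
    Role="arn:aws:iam::111122223333:role/GlueCrawlerRole",
    DatabaseName="fsi_raw",
    Targets={"S3Targets": [{"Path": "s3://example-fsi-raw-data/trades/"}]},
)

# Run it; discovered tables and partitions land in the Data Catalog.
glue.start_crawler(Name="raw-trades-crawler")
```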
Step 5
Customers can process the raw data using AWS Glue Studio jobs or Amazon EMR.
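A Glue Studio job, once authored, is started like any other AWS Glue job. The job name and arguments below are hypothetical; Glue Studio generates the underlying script, and this call simply triggers a run:

```python
import boto3

glue = boto3.client("glue")

# "curate-trades" is an assumed job authored in AWS Glue Studio; the
# source and target paths are passed through as job arguments.
run = glue.start_job_run(
    JobName="curate-trades",
    Arguments={
        "--source_path": "s3://example-fsi-raw-data/trades/",
        "--target_path": "s3://example-fsi-processed-data/trades/",
    },
)
print("Job run id:", run["JobRunId"])
```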
Step 6
Amazon DynamoDB stores job details, results, and other metadata for auditing purposes.
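One lightweight pattern is to write a record per job run into a DynamoDB table that auditors can query later. The table name and attributes below are illustrative, not prescriptive:

```python
import boto3
from datetime import datetime, timezone

# "job-audit-log" is a hypothetical table keyed on job_run_id.
table = boto3.resource("dynamodb").Table("job-audit-log")

# Record one processing run for audit purposes.
table.put_item(
    Item={
        "job_run_id": "jr_0123456789abcdef",
        "job_name": "curate-trades",
        "status": "SUCCEEDED",
        "records_written": 125000,
        "completed_at": datetime.now(timezone.utc).isoformat(),
    }
)
```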
Step 7
AWS Glue Data Catalog stores processed data schema and partition information.
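The catalog entries written in the previous steps can be read back to verify the schema and partitions registered for the processed data. The database and table names here are the same hypothetical ones used above:

```python
import boto3

glue = boto3.client("glue")

# Inspect the cataloged schema for the processed table.
table = glue.get_table(DatabaseName="fsi_processed", Name="trades")
for col in table["Table"]["StorageDescriptor"]["Columns"]:
    print(col["Name"], col["Type"])

# Confirm the partitions the crawler or job registered.
partitions = glue.get_partitions(DatabaseName="fsi_processed", TableName="trades")
print(len(partitions["Partitions"]), "partitions cataloged")
```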
Step 8
S3 buckets store processed data for retention, configured with S3 Object Lock in Compliance Mode, with a default retention period that matches compliance requirements.
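The sketch below shows one way to set this up with boto3. Note that Object Lock is enabled when the bucket is created, and that the seven-year compliance-mode default retention is purely illustrative; confirm the period your regulator requires before applying it, because compliance-mode retention cannot be shortened or removed once set.

```python
import boto3

s3 = boto3.client("s3")

# Enable Object Lock at creation time (this also enables S3 Versioning).
# In Regions other than us-east-1, also pass CreateBucketConfiguration.
s3.create_bucket(
    Bucket="example-fsi-processed-data",  # hypothetical name
    ObjectLockEnabledForBucket=True,
)

# Apply a bucket-wide default retention in compliance mode. The
# seven-year period is illustrative; set the period your regulator
# actually requires.
s3.put_object_lock_configuration(
    Bucket="example-fsi-processed-data",
    ObjectLockConfiguration={
        "ObjectLockEnabled": "Enabled",
        "Rule": {"DefaultRetention": {"Mode": "COMPLIANCE", "Years": 7}},
    },
)
```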
Step 9
AWS Lake Formation provides access control and governance, enabling granular access control at the database, table, or column level.
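For example, a hypothetical analyst role can be granted SELECT on a specific set of columns only, so sensitive fields in the same table stay hidden:

```python
import boto3

lf = boto3.client("lakeformation")

# Grant column-level read access; the role ARN, database, table, and
# column names are all illustrative assumptions.
lf.grant_permissions(
    Principal={
        "DataLakePrincipalIdentifier": "arn:aws:iam::111122223333:role/AnalystRole"
    },
    Resource={
        "TableWithColumns": {
            "DatabaseName": "fsi_processed",
            "Name": "trades",
            "ColumnNames": ["trade_id", "symbol", "quantity", "trade_date"],
        }
    },
    Permissions=["SELECT"],
)
```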
Step 10
End users, such as the record management team, data science teams, auditors, and designated third parties (D3Ps), access the data through services such as Amazon Athena, Amazon Redshift Spectrum, and Amazon SageMaker.
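As one example of end-user access, the following sketch runs an Athena query against the governed table and polls for the result. The database, table, query, and results location are assumptions carried over from the earlier steps:

```python
import time

import boto3

athena = boto3.client("athena")

# Submit a query; the results bucket is a hypothetical name.
qid = athena.start_query_execution(
    QueryString="SELECT symbol, SUM(quantity) FROM trades GROUP BY symbol",
    QueryExecutionContext={"Database": "fsi_processed"},
    ResultConfiguration={"OutputLocation": "s3://example-fsi-athena-results/"},
)["QueryExecutionId"]

# Poll until the query reaches a terminal state.
while True:
    status = athena.get_query_execution(QueryExecutionId=qid)
    state = status["QueryExecution"]["Status"]["State"]
    if state in ("SUCCEEDED", "FAILED", "CANCELLED"):
        break
    time.sleep(1)

if state == "SUCCEEDED":
    rows = athena.get_query_results(QueryExecutionId=qid)["ResultSet"]["Rows"]
    print(f"{len(rows) - 1} rows returned")  # first row is the header
```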
Well-Architected Pillars
The AWS Well-Architected Framework helps you understand the pros and cons of the decisions you make when building systems in the cloud. The six pillars of the Framework allow you to learn architectural best practices for designing and operating reliable, secure, efficient, cost-effective, and sustainable systems. Using the AWS Well-Architected Tool, available at no charge in the AWS Management Console, you can review your workloads against these best practices by answering a set of questions for each pillar.
The architecture diagram above is an example of a Solution created with Well-Architected best practices in mind. To be fully Well-Architected, you should follow as many Well-Architected best practices as possible.
Operational Excellence
This Guidance uses fully managed services, such as Amazon S3, DataSync, Transfer Family, AWS Glue, Lake Formation, and Athena. These services eliminate the need to administer data processing, data storage, and data warehousing systems, so you can focus on building your applications.
Security
End users authenticate through AWS Identity and Access Management (IAM) single sign-on, which authorizes access to Amazon QuickSight dashboards and the Athena user interface, as well as the Amazon Redshift query user interface (UI) for ad hoc queries and SageMaker for machine learning (ML) projects. DataSync uses HTTPS for encryption in transit. Transfer Family supports SSH File Transfer Protocol (SFTP) and File Transfer Protocol over SSL (FTPS), which are secured by the underlying Secure Shell (SSH) and Transport Layer Security (TLS) cryptographic protocols. Snowball supports server-side encryption at rest, and Amazon S3 supports both server-side and client-side encryption.
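For example, default server-side encryption with AWS KMS can be enforced at the bucket level, so every object written to the bucket is encrypted at rest regardless of how it arrives. The bucket name and key alias are illustrative:

```python
import boto3

s3 = boto3.client("s3")

# Make SSE-KMS the default for all new objects in the bucket.
s3.put_bucket_encryption(
    Bucket="example-fsi-raw-data",
    ServerSideEncryptionConfiguration={
        "Rules": [
            {
                "ApplyServerSideEncryptionByDefault": {
                    "SSEAlgorithm": "aws:kms",
                    "KMSMasterKeyID": "alias/fsi-data-key",  # hypothetical alias
                },
                "BucketKeyEnabled": True,  # reduces KMS request costs
            }
        ]
    },
)
```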
Reliability
Serverless capabilities such as Athena, AWS Glue, Lake Formation, DynamoDB, Amazon Redshift Serverless, and Amazon EMR Serverless scale with demand. Transfer Family supports up to three Availability Zones to minimize network latency. Amazon EMR supports multi-master deployments in the same Availability Zone, while Amazon Redshift uses a relocation capability that allows you to move a cluster to another Availability Zone with minimal changes to your application. DataSync recovers from network path failures and uses integrity checks and full checksums to ensure correct transfer of data.
Performance Efficiency
Serverless services scale automatically with demand and release resources when they are idle, so each task consumes only the minimum capacity it requires.
Cost Optimization
In this Guidance, we use serverless services that scale automatically with demand so that you pay only for the resources you use. For example, AWS Glue and Amazon EMR Serverless consume resources only while jobs are running. Users pay only for the Athena queries they run, and Amazon Redshift Serverless scales with demand. Additionally, DataSync transfers data to AWS efficiently to minimize costs. Amazon EMR can make use of transient clusters and Amazon Elastic Compute Cloud (Amazon EC2) Spot Instances, which provide up to a 90% discount compared to On-Demand prices.
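A hedged sketch of a transient cluster with a Spot task group follows; the cluster terminates itself when the step finishes, so you pay nothing between runs. All names, roles, and paths are illustrative:

```python
import boto3

emr = boto3.client("emr")

# Transient cluster: KeepJobFlowAliveWhenNoSteps=False terminates it
# once the step completes. The TASK group uses Spot pricing.
emr.run_job_flow(
    Name="nightly-curation",
    ReleaseLabel="emr-7.1.0",
    Applications=[{"Name": "Spark"}],
    Instances={
        "InstanceGroups": [
            {"InstanceRole": "MASTER", "InstanceType": "m5.xlarge", "InstanceCount": 1},
            {"InstanceRole": "CORE", "InstanceType": "m5.xlarge", "InstanceCount": 2},
            {
                "InstanceRole": "TASK",
                "InstanceType": "m5.xlarge",
                "InstanceCount": 4,
                "Market": "SPOT",  # up to ~90% cheaper than On-Demand
            },
        ],
        "KeepJobFlowAliveWhenNoSteps": False,
    },
    Steps=[
        {
            "Name": "curate",
            "ActionOnFailure": "TERMINATE_CLUSTER",
            "HadoopJarStep": {
                "Jar": "command-runner.jar",
                "Args": ["spark-submit", "s3://example-fsi-scripts/curate.py"],
            },
        }
    ],
    JobFlowRole="EMR_EC2_DefaultRole",
    ServiceRole="EMR_DefaultRole",
)
```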
Sustainability
Because this Guidance relies extensively on serverless services and dynamic scaling, resources are consumed only when they are needed. You do not have to maintain peak capacity, and you avoid the costly application failures that can occur when scaling resources manually.
Implementation Resources
A detailed guide is provided for you to experiment with and use within your own AWS account. It walks through each stage of the Guidance, including deployment, usage, and cleanup, to prepare it for production use.
The sample code is a starting point. It is industry validated, prescriptive but not definitive, and a peek under the hood to help you begin.
Related Content
How financial institutions modernize record retention on AWS
Disclaimer
The sample code; software libraries; command line tools; proofs of concept; templates; or other related technology (including any of the foregoing that are provided by our personnel) is provided to you as AWS Content under the AWS Customer Agreement, or the relevant written agreement between you and AWS (whichever applies). You should not use this AWS Content in your production accounts, or on production or other critical data. You are responsible for testing, securing, and optimizing the AWS Content, such as sample code, as appropriate for production grade use based on your specific quality control practices and standards. Deploying AWS Content may incur AWS charges for creating or using AWS chargeable resources, such as running Amazon EC2 instances or using Amazon S3 storage.
References to third-party services or organizations in this Guidance do not imply an endorsement, sponsorship, or affiliation between Amazon or AWS and the third party. Guidance from AWS is a technical starting point, and you can customize your integration with third-party services when you deploy the architecture.