
Guidance for Record Retention Modernization on AWS

Overview

This Guidance helps you modernize your record retention so that you can extract value from your data while staying compliant with record-keeping rules from the U.S. Securities and Exchange Commission (SEC), the Commodity Futures Trading Commission (CFTC), and the Financial Industry Regulatory Authority (FINRA). Financial services institutions (FSIs) are expected to maintain compliant record retention. FSIs often satisfy these requirements with on-premises legacy storage solutions that do not scale, require constant hardware and software refreshes, and do not give end customers easy access to the data. With this Guidance, you can use cloud-native services for storing, processing, and monitoring access to data, so analysts, data scientists, and other stakeholders can work with the data while remaining compliant with regulatory requirements.

How it works

These technical details feature an architecture diagram to illustrate how to effectively use this solution. The architecture diagram shows the key components and their interactions, providing an overview of the architecture's structure and functionality step-by-step.

Well-Architected Pillars

The architecture diagram above is an example of a Solution created with Well-Architected best practices in mind. To be fully Well-Architected, you should follow as many Well-Architected best practices as possible.

This Guidance uses fully managed services, such as Amazon S3, DataSync, Transfer Family, AWS Glue, Lake Formation, and Athena. These services eliminate the need to administer data processing, data storage, and data warehousing systems, so you can focus on building your applications.

Read the Operational Excellence whitepaper 

End users sign in through AWS Identity and Access Management (IAM) single sign-on, which authorizes access to QuickSight dashboards, the Athena user interface, the Amazon Redshift Query user interface (UI) for ad hoc queries, and SageMaker for machine learning (ML) projects. DataSync uses HTTPS for encryption in transit. Transfer Family supports SSH File Transfer Protocol (SFTP) and File Transfer Protocol over SSL (FTPS), which are secured by the Secure Shell (SSH) and Transport Layer Security (TLS) protocols, respectively. Snowball supports server-side encryption at rest. Amazon S3 supports server-side and client-side encryption.
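One way to back up the encryption-in-transit posture described above is a bucket policy that rejects any request not sent over TLS. The following is a minimal sketch, not part of this Guidance: the bucket name `example-records-archive` and the function name are hypothetical, and the policy only builds and prints a JSON document using the standard `aws:SecureTransport` condition key. Attaching it to a bucket is left to your own tooling.

```python
import json

# Hypothetical bucket name, used only for illustration.
BUCKET = "example-records-archive"


def deny_insecure_transport_policy(bucket: str) -> dict:
    """Build a bucket policy that denies any S3 request not sent over TLS."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "DenyInsecureTransport",
                "Effect": "Deny",
                "Principal": "*",
                "Action": "s3:*",
                # Cover both the bucket itself and every object in it.
                "Resource": [
                    f"arn:aws:s3:::{bucket}",
                    f"arn:aws:s3:::{bucket}/*",
                ],
                # aws:SecureTransport is "false" for plain HTTP requests.
                "Condition": {"Bool": {"aws:SecureTransport": "false"}},
            }
        ],
    }


policy = deny_insecure_transport_policy(BUCKET)
print(json.dumps(policy, indent=2))
```

An explicit deny is used here because it takes precedence over any allow statement, so the rule holds regardless of what other permissions are granted.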

Read the Security whitepaper 

Serverless capabilities such as Athena, AWS Glue, Lake Formation, DynamoDB, Amazon Redshift Serverless, and Amazon EMR Serverless scale with demand. Transfer Family endpoints can span up to three Availability Zones for redundancy. Amazon EMR supports deployments with multiple primary nodes in the same Availability Zone, while Amazon Redshift offers a relocation capability that lets you move a cluster to another Availability Zone with minimal changes to your application. DataSync recovers from network path failures and uses integrity checks and full checksums to verify that data is transferred correctly.
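The checksum-based integrity check mentioned above can be illustrated with a small, generic sketch. This is not how DataSync is implemented internally (the source does not describe that); it only shows the general technique of comparing digests of a source file and its copy. The choice of SHA-256, the file names, and the helper names are all assumptions made for the example.

```python
import hashlib
import tempfile
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1024 * 1024) -> str:
    """Return the SHA-256 hex digest of a file, read in fixed-size chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_copy(source: Path, destination: Path) -> bool:
    """Report whether the destination has the same checksum as the source."""
    return sha256_of(source) == sha256_of(destination)


# Demonstrate with a hypothetical record file, a faithful copy,
# and a copy whose contents were altered in transit.
with tempfile.TemporaryDirectory() as tmp:
    src = Path(tmp) / "trade_records.csv"
    good = Path(tmp) / "copy_ok.csv"
    bad = Path(tmp) / "copy_corrupt.csv"
    src.write_bytes(b"order_id,amount\n1,100\n")
    good.write_bytes(src.read_bytes())
    bad.write_bytes(b"order_id,amount\n1,900\n")
    intact_result = verify_copy(src, good)
    corrupt_result = verify_copy(src, bad)

print("intact copy verified:", intact_result)
print("corrupt copy verified:", corrupt_result)
```

Reading in chunks keeps memory use constant, which matters when the records being verified are large archive files.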

Read the Reliability whitepaper 

With serverless services, resources scale automatically and are released when they are no longer needed, so each task uses only the minimum resources it requires.

Read the Performance Efficiency whitepaper 

In this Guidance, we use serverless services that scale automatically with demand so that you pay only for the resources you use. For example, AWS Glue and Amazon EMR Serverless consume resources only when jobs are running. Users pay only for the Athena queries they run, and Amazon Redshift Serverless scales with demand. Additionally, DataSync efficiently transfers data to AWS to minimize costs. Amazon EMR can make use of transient clusters and Amazon Elastic Compute Cloud (Amazon EC2) Spot Instances, which provide up to a 90% discount compared to On-Demand prices.
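To make the "up to 90%" Spot discount concrete, the arithmetic can be sketched as follows. The hourly rate and job duration are hypothetical placeholders, not AWS prices, and the actual discount varies by instance type, Region, and time. The 90% figure is treated as an upper bound only.

```python
MAX_SPOT_DISCOUNT = 0.90  # upper bound stated in the text above


def spot_cost(on_demand_hourly: float, hours: float, discount: float) -> float:
    """Estimate the cost of a job on Spot capacity at a given discount."""
    if not 0.0 <= discount <= MAX_SPOT_DISCOUNT:
        raise ValueError("discount must be between 0 and 0.90")
    return on_demand_hourly * hours * (1.0 - discount)


# Hypothetical figures for illustration only.
ON_DEMAND_HOURLY = 1.00  # assumed On-Demand price per instance-hour (USD)
HOURS = 100              # assumed total instance-hours for a transient cluster

on_demand_total = ON_DEMAND_HOURLY * HOURS
best_case_spot = spot_cost(ON_DEMAND_HOURLY, HOURS, MAX_SPOT_DISCOUNT)
typical_spot = spot_cost(ON_DEMAND_HOURLY, HOURS, 0.60)  # assumed discount

print(f"On-Demand:        ${on_demand_total:.2f}")
print(f"Spot (90% off):   ${best_case_spot:.2f}")
print(f"Spot (60% off):   ${typical_spot:.2f}")
```

Because Spot capacity can be reclaimed, an estimate like this fits best with transient, restartable workloads such as the batch jobs described above.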

Read the Cost Optimization whitepaper 

Because this Guidance makes extensive use of serverless services and dynamic scaling, resources are consumed only when needed. You do not need to maintain peak capacity to avoid costly application failures when scaling resources.

Read the Sustainability whitepaper 

Disclaimer

The sample code; software libraries; command line tools; proofs of concept; templates; or other related technology (including any of the foregoing that are provided by our personnel) is provided to you as AWS Content under the AWS Customer Agreement, or the relevant written agreement between you and AWS (whichever applies). You should not use this AWS Content in your production accounts, or on production or other critical data. You are responsible for testing, securing, and optimizing the AWS Content, such as sample code, as appropriate for production grade use based on your specific quality control practices and standards. Deploying AWS Content may incur AWS charges for creating or using AWS chargeable resources, such as running Amazon EC2 instances or using Amazon S3 storage.

References to third-party services or organizations in this Guidance do not imply an endorsement, sponsorship, or affiliation between Amazon or AWS and the third party. Guidance from AWS is a technical starting point, and you can customize your integration with third-party services when you deploy the architecture.