Simplify Management of Amazon Redshift Snapshots using AWS Lambda
NOTE: Amazon Redshift now supports creating an automatic snapshot schedule using the snapshot scheduler. For more information, please review this “What’s New” post.
Ian Meyers is a Solutions Architecture Senior Manager with AWS
Amazon Redshift is a fast, fully managed, petabyte-scale data warehouse that makes it simple and cost-effective to analyze all your data using your existing business intelligence tools. A cluster is automatically backed up to Amazon S3 by default, and three automatic snapshots of the cluster are retained for 24 hours. You can also convert these automatic snapshots to ‘manual’, which means they are kept until you delete them. Snapshots are incremental: they store only the changes made since the last snapshot was taken, which makes them very space efficient.
You can restore manual snapshots into new clusters at any time, or you can use them to do table restores, without having to use any third-party backup/recovery software. (For an overview of how to build systems that use disaster recovery best practices, see the AWS white paper Using AWS for Disaster Recovery.)
When creating cluster backups for a production system, you must carefully consider two dimensions:
- RTO: Recovery Time Objective. How long does it take to recover in a disaster recovery scenario?
- RPO: Recovery Point Objective. When you have recovered, to what point in time will the system be consistent?
Recovery Time Objective
When using Amazon Redshift, your RTO is determined by the node type you are using, how many of those nodes you have, and the size of the data they store. It is vital that you practice restoring from snapshots created on the cluster so that you can correctly determine your Recovery Time Objective. It is also important that you re-test restore performance any time you resize the cluster or your data volume changes significantly.
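One way to practice this is to time a restore into a throwaway cluster. The following is a minimal sketch, not part of the utility described in this post. It assumes `redshift` is a boto3 Redshift client (for example `boto3.client("redshift")`); the function name `time_restore` and the identifiers passed to it are hypothetical.

```python
import time


def time_restore(redshift, snapshot_id, test_cluster_id, clock=time.monotonic):
    """Restore a snapshot into a test cluster and return the elapsed seconds.

    `redshift` is assumed to be a boto3 Redshift client; `snapshot_id` and
    `test_cluster_id` are placeholder identifiers chosen by the caller.
    """
    started = clock()
    redshift.restore_from_cluster_snapshot(
        ClusterIdentifier=test_cluster_id,
        SnapshotIdentifier=snapshot_id,
    )
    # Block until the restore is reported as complete. For large clusters the
    # waiter's default polling limits may need raising through WaiterConfig.
    redshift.get_waiter("cluster_restored").wait(ClusterIdentifier=test_cluster_id)
    return clock() - started
```

The measured time covers only the restore itself. A realistic RTO figure would also include whatever steps you need before applications can reconnect, and the test cluster should be deleted afterwards to avoid unnecessary cost.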
Recovery Point Objective
Automated backups are triggered when a threshold of changed blocks is reached or after a certain amount of time has passed. For a cluster with minimal changes to data, a backup is taken approximately every 8 hours. For a cluster that churns a massive amount of data, backups can be taken several times per hour. If your data churn rate doesn’t trigger automated backups often enough to satisfy your RPO, you can use the Snapshot Manager utility described below to supplement the automated backups with additional manual snapshots and so meet the targeted RPO.
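The idea of supplementing automated backups can be sketched as a simple check: if the newest snapshot is older than the RPO, take a manual one. This is an illustration only, not the Snapshot Manager's actual code. It assumes `redshift` is a boto3 Redshift client; the function names and the `-rpo-` naming scheme are hypothetical.

```python
from datetime import datetime, timedelta, timezone


def snapshot_due(latest_snapshot_time, rpo, now):
    """Return True when no snapshot exists or the newest one is older than the RPO."""
    if latest_snapshot_time is None:
        return True
    return now - latest_snapshot_time >= rpo


def ensure_snapshot(redshift, cluster_id, rpo, now=None):
    """Create a manual snapshot if the RPO would otherwise be missed.

    Returns the new snapshot identifier, or None when no snapshot was needed.
    """
    now = now or datetime.now(timezone.utc)
    # Only the first page of results is read here; a fuller version would paginate.
    response = redshift.describe_cluster_snapshots(ClusterIdentifier=cluster_id)
    times = [s["SnapshotCreateTime"] for s in response.get("Snapshots", [])]
    latest = max(times, default=None)
    if not snapshot_due(latest, rpo, now):
        return None
    # Hypothetical naming scheme: cluster name plus a UTC timestamp.
    snapshot_id = f"{cluster_id}-rpo-{now:%Y%m%d-%H%M%S}"
    redshift.create_cluster_snapshot(
        SnapshotIdentifier=snapshot_id,
        ClusterIdentifier=cluster_id,
    )
    return snapshot_id
```

A check like this would need to run on a schedule noticeably shorter than the RPO (for example from a scheduled AWS Lambda function), since a snapshot is only taken when the check actually runs.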
We’ve just launched a new Amazon Redshift Utils module that helps you manage the snapshots that your cluster creates. You supply a simple configuration, and then AWS Lambda ensures that you have cluster snapshots as frequently as required to meet your RPO. It also manages the retention of the snapshots it creates, and lets you create layered backup schedules to meet both short-term and long-term backup requirements.
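To make the idea of a layered schedule concrete, here is a minimal sketch of one possible retention policy: keep every snapshot for a short window, then keep only one snapshot per day for a longer window. The function name, the tuple layout, and the default windows are assumptions made for illustration; they do not reflect the module's real configuration format.

```python
from datetime import datetime, timedelta, timezone


def snapshots_to_delete(snapshots, now,
                        short_term=timedelta(days=2),
                        long_term=timedelta(days=30)):
    """Return the ids of snapshots that a two-layer retention policy would drop.

    `snapshots` is a list of (snapshot_id, created_at) tuples with
    timezone-aware datetimes. Both retention windows are illustrative defaults.
    """
    keep = set()
    days_covered = set()
    # Walk oldest first so the earliest snapshot of each day is the one kept.
    for snapshot_id, created in sorted(snapshots, key=lambda s: s[1]):
        age = now - created
        if age <= short_term:
            # Short-term layer: keep everything.
            keep.add(snapshot_id)
        elif age <= long_term and created.date() not in days_covered:
            # Long-term layer: keep one snapshot per calendar day.
            days_covered.add(created.date())
            keep.add(snapshot_id)
    return [sid for sid, _ in snapshots if sid not in keep]
```

The returned identifiers could then be passed one at a time to a boto3 Redshift client's `delete_cluster_snapshot` call. In practice you would restrict the input to manual snapshots that your own tooling created, for example by filtering on a naming prefix, so that unrelated snapshots are never removed.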
Have a look at the Snapshot Manager project and let us know how it works for you!
If you have questions or suggestions, please leave a comment below.