How do I implement disaster recovery or fault tolerance for my Amazon ElastiCache Redis cluster?

Last updated: 2020-10-21

I need to implement disaster recovery or fault tolerance for my Amazon ElastiCache Redis cluster data. What options are available?

Resolution

The available fault tolerance solutions each have their own balance of data durability, performance impact, and cost. Choose the one that best fits your use case:

Multi-AZ

Multi-AZ is the best option when data retention, minimal downtime, and application performance are a priority.

  • Data loss potential - Low. Multi-AZ provides fault tolerance for every scenario, including hardware-related issues.
  • Performance impact - Low. Of the available options, Multi-AZ provides the fastest time to recovery, because failover is automatic and requires no manual procedure after Multi-AZ is enabled.
  • Cost - Low to high. Multi-AZ requires at least one replica node, so cost ranges from low (a single small replica) to high (several large replicas). Use Multi-AZ when you can't risk losing data because of hardware failure, or when you can't afford the downtime that the other options require during an outage.

For more information on Multi-AZ, see Minimizing downtime in ElastiCache for Redis with Multi-AZ.
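
As a minimal sketch, the following Python shows one way to turn on Multi-AZ with automatic failover for an existing replication group through the boto3 ElastiCache client. The function names, the group ID, and the default Region are illustrative assumptions. The replication group is assumed to already have at least one replica.

```python
def multi_az_settings(replication_group_id, apply_immediately=True):
    """Build keyword arguments for the ElastiCache ModifyReplicationGroup call.

    Hypothetical helper for illustration. Multi-AZ depends on automatic
    failover, so both flags are set together.
    """
    if not replication_group_id:
        raise ValueError("replication_group_id is required")
    return {
        "ReplicationGroupId": replication_group_id,
        "AutomaticFailoverEnabled": True,
        "MultiAZEnabled": True,
        "ApplyImmediately": apply_immediately,
    }


def enable_multi_az(replication_group_id, region_name="us-east-1"):
    """Apply the settings. Needs boto3 and AWS credentials (assumed available)."""
    import boto3  # imported lazily so the builder above works without the SDK

    client = boto3.client("elasticache", region_name=region_name)
    return client.modify_replication_group(**multi_az_settings(replication_group_id))
```

Calling enable_multi_az("my-redis-group") would submit the change. The group ID here is a placeholder, not a value from this article.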

Daily automatic backups

You can schedule daily automatic backups at a time when you expect low resource utilization for your cluster. ElastiCache creates a backup of the cluster, and then writes all data from the cache to a Redis RDB file. Redis versions 2.8.22 and later implement a forkless backup that can improve performance.

Note: Redis backup and restore aren't supported on cache.t1.micro nodes for cluster mode disabled clusters.

  • Data loss potential - High (up to a day’s worth). Daily automatic backups are retained for up to 35 days.
  • Performance impact - Medium to high. Running multiple file backups throughout the day impacts performance. To reduce the impact, consider enabling RDB snapshots on a designated persistence-only secondary node, and then disabling both RDB snapshots and the Redis append-only file (AOF) on the primary node and all other secondary nodes.
  • Cost - Low to medium. Storage costs increase with the number of backups and the data retention duration.
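
The backup schedule described above can be sketched with boto3 as follows. This is an illustration under stated assumptions, not a definitive implementation: the helper names, the default window, the default retention period, and the optional source-node argument are all hypothetical. The 35-day ceiling comes from the retention limit noted above, and the backup window is expressed in UTC.

```python
import re

# Backup window format assumed here: "HH:MM-HH:MM" in 24-hour UTC time.
_WINDOW = re.compile(r"^([01]\d|2[0-3]):[0-5]\d-([01]\d|2[0-3]):[0-5]\d$")


def backup_settings(replication_group_id, retention_days=7,
                    window_utc="05:00-06:00", snapshot_source_node=None):
    """Build keyword arguments for ModifyReplicationGroup (hypothetical helper)."""
    if not 1 <= retention_days <= 35:
        raise ValueError("retention_days must be between 1 and 35")
    if not _WINDOW.match(window_utc):
        raise ValueError("window_utc must look like '05:00-06:00'")
    settings = {
        "ReplicationGroupId": replication_group_id,
        "SnapshotRetentionLimit": retention_days,
        "SnapshotWindow": window_utc,
    }
    if snapshot_source_node:
        # Optional: the node to take the daily backup from, for example a
        # replica. Not settable on cluster mode enabled replication groups.
        settings["SnapshottingClusterId"] = snapshot_source_node
    return settings


def schedule_daily_backups(replication_group_id, region_name="us-east-1", **kwargs):
    """Apply the settings. Needs boto3 and AWS credentials (assumed available)."""
    import boto3  # imported lazily so the builder above works without the SDK

    client = boto3.client("elasticache", region_name=region_name)
    return client.modify_replication_group(
        **backup_settings(replication_group_id, **kwargs)
    )
```

Choosing a window during a low-traffic period, and a replica as the backup source where one exists, keeps the backup load away from the node that serves writes.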

Before implementing backup and restore, consider the limitations caused by backup constraints. For comprehensive information about implementing backups for ElastiCache clusters running Redis, see Backup and restore for ElastiCache for Redis. To create backups on demand, see Making manual backups.
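
For a manual backup, a rough sketch with boto3 might look like the following. The helper names and the timestamped naming scheme are assumptions made for illustration. Only the CreateSnapshot call and its ReplicationGroupId and SnapshotName parameters come from the ElastiCache API.

```python
from datetime import datetime, timezone


def manual_snapshot_request(replication_group_id, now=None):
    """Build keyword arguments for CreateSnapshot (hypothetical helper).

    The snapshot name embeds a UTC timestamp so repeated backups do not collide.
    """
    now = now or datetime.now(timezone.utc)
    stamp = now.strftime("%Y%m%d-%H%M%S")
    return {
        "ReplicationGroupId": replication_group_id,
        "SnapshotName": f"{replication_group_id}-manual-{stamp}",
    }


def create_manual_backup(replication_group_id, region_name="us-east-1"):
    """Start the backup. Needs boto3 and AWS credentials (assumed available)."""
    import boto3  # imported lazily so the builder above works without the SDK

    client = boto3.client("elasticache", region_name=region_name)
    return client.create_snapshot(**manual_snapshot_request(replication_group_id))
```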