What is data backup?
A data backup is a copy of your system, configuration, or application data that’s stored separately from the original. Organizations can experience unexpected events like natural disasters, human errors, security events, or system failures. Data backup is a critical data protection function that decreases the risk of full or partial data loss when such events occur. It gives organizations the ability to restore systems and applications to a previous, known-good state.
Why is data backup important?
While every organization hopes that their systems will operate as expected at all times, isolated system components can and do fail. System-wide failure, while rare, is also possible. Data backup refers to the infrastructure, technologies, and processes that copy organizational data for restoration in case of failures. It includes a disaster recovery plan, complete with the appropriate data backup strategy and solutions in place.
Effective data backup prevents data and system loss in the event of a disaster. It helps ensure business continuity and uninterrupted service, even under unexpected conditions. Critical business systems become operational quickly, with minimal business impact.
Without appropriate data backup and recovery, systems may be offline for hours, days, or weeks. In some circumstances they may not be recoverable at all, even with the help of expert digital forensics.
What are the benefits of data backup?
Data backup also offers the following benefits.
Reduce unnecessary expenditure
System downtime can cost organizations heavily in wasted time and missed opportunities. Damage to business reputation can be as difficult to recover from as the disaster itself, if not more so. With an appropriate, comprehensive data backup and recovery plan in place, organizations can prepare for issues in advance and maintain their business reputation.
Meet contracted agreements
Organizations that have contracted agreements in place (such as service level agreements, partnership agreements, and vendor agreements) can continue to fulfill the terms of these agreements, even during a disaster. By providing uninterrupted service, or at least a basic level of service, during a disaster, organizations help maintain customer trust.
Access version history
Version history, while not the main goal of data backup, is a beneficial side effect. It proves useful when certain changes made to the system lead to undesirable outcomes. Organizations can restore a point-in-time system snapshot if they prefer it to the current state.
Meet compliance and auditing obligations
Legislation and industry standards worldwide require businesses to protect sensitive data and retain it for specified periods. Some impose specific data backup mechanisms as requirements for meeting data protection standards. A proven data backup and recovery capability strengthens the organization's position during audits by providing evidence of data integrity and compliance.
How does data backup work?
The data backup process starts with identifying and prioritizing the criticality of an organization’s data and systems. You can then schedule regular backups with backup software to ensure critical data copies are up to date.
The schedule may include different methods and storage types for optimal coverage and cost. The copying process from live to backup storage also depends on the storage type and technologies you use.
Next we discuss how data backup methods and backup testing work.
Data backup methods
Data can be backed up by various methods. Some methods back up a full copy of the data each time, while others only copy new changes to the data. Each method has its benefits and shortcomings.
Full backup
Full backups take a complete copy of all the data each time, stored either as is or compressed and encrypted. Synthetic full backups assemble a new full backup from an earlier full backup plus one or more incremental backups, without copying all the data from the source again.
Incremental backup
Incremental backups copy any data that has been changed since the last backup, regardless of the last backup method. Reverse incremental backups add any changed data to the last full backup.
Differential backup
Differential backups copy any data that has changed since the last full backup, regardless of whether another backup has been made with any other method in the meantime.
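The difference between full, incremental, and differential backups can be illustrated with a minimal sketch. The names FileInfo and select_for_backup, and the integer timestamps, are hypothetical and chosen only for this example. Real backup software tracks changes in more sophisticated ways.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FileInfo:
    path: str
    modified_at: int  # hypothetical timestamp, for example hours since a fixed start


def select_for_backup(files, method, last_full_at, last_backup_at):
    """Return the paths a backup run would copy under the given method."""
    if method == "full":
        # Everything is copied, whether or not it changed.
        return [f.path for f in files]
    if method == "differential":
        # Everything changed since the last FULL backup.
        return [f.path for f in files if f.modified_at > last_full_at]
    if method == "incremental":
        # Everything changed since the last backup of ANY kind.
        return [f.path for f in files if f.modified_at > last_backup_at]
    raise ValueError(f"unknown backup method: {method}")


files = [
    FileInfo("orders.db", modified_at=30),
    FileInfo("config.yaml", modified_at=12),
    FileInfo("logo.png", modified_at=2),
]

# Assumed history: a full backup at t=10 and an incremental backup at t=20.
full = select_for_backup(files, "full", last_full_at=10, last_backup_at=20)
differential = select_for_backup(files, "differential", last_full_at=10, last_backup_at=20)
incremental = select_for_backup(files, "incremental", last_full_at=10, last_backup_at=20)
```

In this sketch the differential run copies two files because both changed after the full backup, while the incremental run copies only the file that changed after the most recent backup.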
Mirror backup
A mirror backup is stored in a non-compressed format that mirrors all the files and configurations in the source data. It can be accessed like the original data.
Backup testing
Organizations test their backup solutions by simulating recovery from the failure of one or more systems. They then track metrics like mean time to recovery. Rather than keeping backup copies forever and letting them take up storage space, organizations can also use their backup software to schedule the deletion of backups that are no longer needed.
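As a rough illustration, mean time to recovery can be computed from the results of recovery drills. The function name and the drill data below are hypothetical, and the sketch assumes each drill records the minute the simulated failure started and the minute service was restored.

```python
from statistics import mean


def mean_time_to_recovery(drills):
    """Average recovery duration over (failed_at, restored_at) pairs, in minutes."""
    durations = [restored_at - failed_at for failed_at, restored_at in drills]
    if not durations:
        raise ValueError("at least one recovery drill is required")
    return mean(durations)


# Hypothetical drill results: (failure minute, restoration minute)
drills = [(0, 45), (0, 30), (10, 85)]
mttr_minutes = mean_time_to_recovery(drills)
```

Tracking this figure across successive drills shows whether recovery is getting faster or slower as systems and backup volumes change.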
What are the different backup storage types?
Different storage types can store data in different ways. This depends on the medium and protocols used, including object, block, or file-based storage. Backup data storage may be fixed or portable, physical or virtual, and on premises or in the cloud. It can also be standalone or exist as part of a storage array.
Organizations typically use a combination of storage types for their data backups.
Removable storage
Removable storage temporarily connects directly to a device, then is transported to a different location. Here are some examples:
- Tape storage, such as Linear Tape-Open (LTO), stores digital data on physical tapes
- External drive types include hard disk drives (HDDs) and solid-state drives (SSDs)
- Optical disc formats include DVD and Blu-ray
Networked storage
Network-attached storage (NAS) has a direct network connection to the device it’s backing up.
NAS has multiple drives in a single device for a larger amount of storage. A disk array has a number of storage drives in a single device, typically more than NAS. A storage area network (SAN) is a configuration of storage devices, governed by a controller, for centralized storage attached to a network.
Backup storage devices may also be virtualized. Virtual NAS, disk arrays, and the like can be used in backup situations.
Data center
A data center is a physical location that offers one or more different types of storage. Connections from an organization to the data center may be through the internet or dedicated cabling. Organizations use private on-premises data centers for on-premises data backups and cloud providers’ data centers for cloud-based backups.
Cloud-based storage
Cloud storage is off-site storage in a remote location, often in distributed data centers, where backup storage may be physical or virtualized. Cloud-based storage abstracts away much of the technical management, configuration, and maintenance of storage devices. Instead, organizations focus on rule- and policy-based management. Cloud-based backups may back up cloud-based resources and on-premises resources.
How does data recovery work?
Recovery mechanisms use the data backup to restore system state. Organizations typically define a recovery point objective (RPO), which is the maximum acceptable period of data loss measured back from the incident. The RPO determines how recent the recoverable system state must be. By working through a data recovery plan that’s been outlined in advance, organizations can become fully or partially operational in the shortest time possible.
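One common way to read an RPO is as the maximum acceptable gap between the most recent usable backup and the incident. The sketch below checks a backup history against an RPO under that assumption. The function names and timestamps are hypothetical.

```python
from datetime import datetime, timedelta


def data_loss_window(backup_times, incident_time):
    """Time between the latest backup taken before the incident and the incident."""
    earlier = [t for t in backup_times if t <= incident_time]
    if not earlier:
        return None  # no backup exists from before the incident
    return incident_time - max(earlier)


def meets_rpo(backup_times, incident_time, rpo):
    """True if the data lost since the last backup fits within the RPO."""
    window = data_loss_window(backup_times, incident_time)
    return window is not None and window <= rpo


# Hypothetical schedule: a backup every six hours
backups = [
    datetime(2024, 1, 1, 0, 0),
    datetime(2024, 1, 1, 6, 0),
    datetime(2024, 1, 1, 12, 0),
]
incident = datetime(2024, 1, 1, 15, 30)

window = data_loss_window(backups, incident)
ok_with_4h_rpo = meets_rpo(backups, incident, timedelta(hours=4))
ok_with_1h_rpo = meets_rpo(backups, incident, timedelta(hours=1))
```

Here the six-hourly schedule satisfies a four-hour RPO for this particular incident but not a one-hour RPO, which would call for more frequent backups.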
The recovery process depends on four factors:
- Incident that led to recovery being initiated
- Current system state and conditions
- Immediate desired state of the system
- Technologies used for backups
Sometimes, instead of the actual system, virtual systems loaded with backup data are brought online and connected to other systems that are still operational. Coordinating such a task requires careful advance planning that anticipates these conditions.
What are the considerations in selecting a data backup solution?
A backup strategy should account for the different types of disasters and data security situations that affect data and systems. Selecting the types of backup storage to use in your organization depends on factors like these:
- Cost
- Time to copy and recover
- Storage persistence and scalability
- Location and energy efficiency
- Data security and compliance
Organizations must decide which storage method, or combination of storage methods, best suits them. They must also decide how far back in time version history should persist, according to their unique internal needs.
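A retention decision like this often translates into a simple rule: backups older than a chosen number of days become candidates for deletion. The following is a minimal sketch of that rule. The function name, the dates, and the 30-day retention period are hypothetical.

```python
from datetime import date, timedelta


def expired_backups(backup_dates, today, retention_days):
    """Return, oldest first, the backup dates that fall outside the retention period."""
    cutoff = today - timedelta(days=retention_days)
    return sorted(d for d in backup_dates if d < cutoff)


# Hypothetical backup history
history = [date(2024, 3, 20), date(2024, 2, 15), date(2024, 3, 1)]
to_delete = expired_backups(history, today=date(2024, 3, 31), retention_days=30)
```

With a 30-day retention period evaluated on March 31, only the February 15 backup is selected. The March 1 backup sits exactly on the cutoff and is kept in this sketch.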
While it may seem redundant, it’s important to store backups across multiple different types of storage and in multiple different locations. This helps ensure there’s always an available backup, no matter the circumstances.
Many organizations choose to follow the 3-2-1 rule. This rule stipulates that, for maximum recoverability after any type of failure, there should be at least three copies of data across two different types of media, with at least one copy stored off-site.
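The 3-2-1 rule is simple enough to express as a check. The sketch below assumes that every copy of the data, including the primary copy, is listed with its storage medium and location. The BackupCopy class and the example inventory are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BackupCopy:
    name: str
    medium: str  # for example "disk", "tape", or "cloud"
    offsite: bool


def satisfies_3_2_1(copies):
    """Check for at least 3 copies, on at least 2 media types, with at least 1 off-site."""
    enough_copies = len(copies) >= 3
    enough_media = len({c.medium for c in copies}) >= 2
    has_offsite = any(c.offsite for c in copies)
    return enough_copies and enough_media and has_offsite


# Hypothetical inventory of copies
inventory = [
    BackupCopy("primary", medium="disk", offsite=False),
    BackupCopy("local-nas", medium="disk", offsite=False),
    BackupCopy("cloud-archive", medium="cloud", offsite=True),
]
compliant = satisfies_3_2_1(inventory)
without_offsite = satisfies_3_2_1(inventory[:2])
```

Removing the cloud copy from this inventory breaks all three conditions at once: two copies, one medium, and nothing off-site.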
How can AWS support your data backup requirements?
Amazon Web Services (AWS) offers world-class cloud backup and recovery solutions alongside hybrid backup configurations. These solutions help organizations maintain business continuity and reduce the risk of data loss.
For self-managed backups, organizations can choose AWS storage solutions like Amazon Simple Storage Service (Amazon S3), Amazon Elastic Block Store (Amazon EBS), and Amazon FSx. You can also use AWS Backup as a managed solution.
AWS Backup is a fully managed backup service that makes it easy to centralize and automate the backup of data. It works across AWS services in the cloud as well as on premises using the AWS Storage Gateway.
You can benefit from using AWS Backup in many ways:
- Centrally configure backup policies and monitor backup activity for AWS resources.
- Automate and consolidate backup tasks previously performed service-by-service. This removes the need to create custom scripts and manual processes.
- Restore complex backups more easily across operating systems.
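As an illustration of policy-based configuration, the sketch below builds a backup plan document in the shape accepted by the create_backup_plan operation of the AWS SDK for Python (boto3). The plan name, vault name, schedule, and retention period are assumptions made for this example, and the helper functions are hypothetical. Submitting the plan requires boto3 and valid AWS credentials.

```python
def build_daily_backup_plan(plan_name, vault_name, retention_days):
    """Build a backup plan document with a single daily rule."""
    return {
        "BackupPlanName": plan_name,
        "Rules": [
            {
                "RuleName": "daily",
                "TargetBackupVaultName": vault_name,
                # Assumed schedule: every day at 05:00 UTC
                "ScheduleExpression": "cron(0 5 ? * * *)",
                # Recovery points are deleted after the retention period
                "Lifecycle": {"DeleteAfterDays": retention_days},
            }
        ],
    }


def submit_backup_plan(plan):
    """Send the plan to AWS Backup and return the new plan ID."""
    import boto3  # imported here so the sketch loads without boto3 installed

    client = boto3.client("backup")
    response = client.create_backup_plan(BackupPlan=plan)
    return response["BackupPlanId"]


# Assumed values for illustration only
plan = build_daily_backup_plan("example-daily-plan", "Default", retention_days=35)
```

A plan on its own does not protect anything until resources are assigned to it, so treat this as a starting point rather than a complete setup.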
Get started with data backup and recovery on AWS by creating an account today.