AWS Security Blog
Ransomware mitigation: Top 5 protections and recovery preparation actions
In this post, I’ll cover the top five things that Amazon Web Services (AWS) customers can do to help protect and recover their resources from ransomware. This blog post focuses specifically on preemptive actions that you can take.
#1 – Set up the ability to recover your apps and data
In order for a traditional encrypt-in-place ransomware attempt to be successful, the actor responsible for the attempt must be able to prevent you from accessing your data, and then hold your data for ransom. The first thing that you should do to protect your account is to ensure that you have the ability to recover your data, regardless of how it was made inaccessible. Backup solutions protect and restore data, and disaster recovery (DR) solutions offer fast recovery of data and workloads.
AWS makes this process significantly easier with services like AWS Backup and CloudEndure Disaster Recovery, which offer robust infrastructure DR. I’ll go over how you can use both of these services to help recover your data. When you choose a data backup solution, simply creating a snapshot of an Amazon Elastic Compute Cloud (Amazon EC2) instance isn’t enough. A powerful function of AWS Backup is that when you create a backup vault, you can encrypt it with a different customer master key (CMK) in AWS Key Management Service (AWS KMS) than the one that protects the source data. That CMK can have a key policy that allows AWS operators to use the key to encrypt the backup but limits decryption to a completely different principal.
In Figure 1, I show an account whose Amazon Elastic Block Store (Amazon EBS) volume, attached to an EC2 instance, is encrypted locally by using CMK A, while AWS Backup uses CMK B. If a user in account A who has a decrypt grant on CMK A attempts to access the backup, the key policy on CMK B won’t allow access to the encrypted data, even if the user is authorized by an AWS Identity and Access Management (IAM) principal access policy.
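The separation between encrypting and decrypting principals can be sketched as a key policy document. The following is a minimal illustration, not the policy from the AWS Backup Developer Guide: the account IDs, role names, and the helper functions `vault_key_policy` and `principals_allowed` are hypothetical, and the exact set of KMS actions that AWS Backup needs may differ from the ones listed here.

```python
import json


def vault_key_policy(admin_arn, backup_arn, restore_arn):
    """Build a key policy for the backup vault CMK (CMK B in the text).

    All three ARNs are placeholders chosen for illustration.
    """
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                # Administration only: no cryptographic operations.
                "Sid": "KeyAdministration",
                "Effect": "Allow",
                "Principal": {"AWS": admin_arn},
                "Action": ["kms:DescribeKey", "kms:GetKeyPolicy", "kms:PutKeyPolicy"],
                "Resource": "*",
            },
            {
                # The backup principal can encrypt but cannot decrypt.
                "Sid": "EncryptBackupsOnly",
                "Effect": "Allow",
                "Principal": {"AWS": backup_arn},
                "Action": ["kms:Encrypt", "kms:GenerateDataKey*", "kms:DescribeKey"],
                "Resource": "*",
            },
            {
                # A completely different principal holds decrypt permission.
                "Sid": "DecryptForRestoreOnly",
                "Effect": "Allow",
                "Principal": {"AWS": restore_arn},
                "Action": ["kms:Decrypt", "kms:DescribeKey"],
                "Resource": "*",
            },
        ],
    }


def principals_allowed(policy, action):
    """Return the principals whose Allow statements list `action` verbatim."""
    return [
        statement["Principal"]["AWS"]
        for statement in policy["Statement"]
        if statement["Effect"] == "Allow" and action in statement["Action"]
    ]


policy = vault_key_policy(
    admin_arn="arn:aws:iam::111122223333:role/KeyAdmin",
    backup_arn="arn:aws:iam::111122223333:role/BackupOperator",
    restore_arn="arn:aws:iam::444455556666:role/BackupRestore",
)
print(json.dumps(policy, indent=2))
print(principals_allowed(policy, "kms:Decrypt"))
```

Note that a principal with `kms:PutKeyPolicy` could rewrite the policy, so in practice the key administrator role would also need to be tightly controlled.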
If you place the backup or replication into a separate account that is dedicated just for backup, this also helps to reduce the likelihood that a threat actor would be able to destroy or tamper with the backup. AWS Backup now natively supports this cross-account capability, which makes the backup process even easier. The AWS Backup Developer Guide provides instructions for using this functionality, as well as the policy that you will need to apply.
Make sure that you’re backing up your data in all supported services and that your backup schedule is based on your business recovery time objective (RTO) and recovery point objective (RPO).
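As a rough sanity check on a backup schedule, you can compare the backup interval with your RPO and your measured restore time with your RTO. The sketch below assumes that worst-case data loss is approximately one backup interval; the function names and the sample values are illustrative only.

```python
from datetime import timedelta


def meets_rpo(backup_interval: timedelta, rpo: timedelta) -> bool:
    """Worst-case data loss is roughly one backup interval, so it must fit in the RPO."""
    return backup_interval <= rpo


def meets_rto(estimated_restore_time: timedelta, rto: timedelta) -> bool:
    """The time needed to restore (ideally measured in a DR test) must fit in the RTO."""
    return estimated_restore_time <= rto


# A daily backup cannot satisfy a four-hour RPO; an hourly backup can.
print(meets_rpo(timedelta(hours=24), rpo=timedelta(hours=4)))  # False
print(meets_rpo(timedelta(hours=1), rpo=timedelta(hours=4)))   # True
print(meets_rto(timedelta(minutes=45), rto=timedelta(hours=2)))  # True
```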
Now, let’s take a look at how CloudEndure Disaster Recovery works.
The high-level architecture diagram in Figure 2 illustrates how CloudEndure Disaster Recovery keeps your entire on-premises environment in sync with replicas in AWS and ready to fail over to AWS at any time, with aggressive recovery objectives and significantly reduced total cost of ownership (TCO). On the left is the source environment, which can be composed of different types of applications—in this case, I give Oracle databases and SQL Servers as examples. And although I’m highlighting DR from on-premises to AWS in this example, CloudEndure Disaster Recovery can provide the same functionality and improved recovery performance between AWS Regions for your workloads that are already in AWS.
The CloudEndure Agent is deployed on the source machines without requiring a reboot and without impacting performance. The agent initiates nearly continuous replication of the source data into AWS. CloudEndure Disaster Recovery also provisions a low-cost staging area that helps reduce the cost of cloud infrastructure during replication and until a machine actually needs to be spun up during failover or disaster recovery tests.
When a customer experiences an outage, CloudEndure Disaster Recovery launches the machines in the appropriate AWS Region, VPC, and target subnets of your choice. The machines held in the dormant, lightweight state called the Staging Area are launched as the actual servers that have been migrated from the source environment (the Oracle databases and SQL Servers, in this example). One of the features of CloudEndure Disaster Recovery is point-in-time recovery, which is important in a ransomware event because you can use it to recover your environment to a previous consistent point in time of your choosing. In other words, you can go back to the environment you had prior to the event.
The machine conversion technology in CloudEndure Disaster Recovery means that those replicated machines can run natively within AWS, and the machines typically take just minutes to boot. You can also conduct frequent DR readiness tests without impacting replication or user activities.
Another service that’s useful for data protection is the AWS object storage service, Amazon Simple Storage Service (Amazon S3), where you can use features such as object versioning to help prevent objects from being overwritten with ransomware-encrypted files, or Object Lock, which provides a write once, read many (WORM) solution to help prevent objects from ever being modified or overwritten.
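The versioning and Object Lock settings described above can be expressed as the request parameters that the S3 API expects. This is a minimal sketch: the bucket name and the 30-day retention period are assumptions, the helper names are hypothetical, and the boto3 calls appear only in comments so that the example runs without AWS credentials.

```python
def versioning_params(bucket):
    """Parameters for enabling object versioning on a bucket."""
    return {"Bucket": bucket, "VersioningConfiguration": {"Status": "Enabled"}}


def object_lock_params(bucket, days, mode="COMPLIANCE"):
    """Parameters for a default Object Lock (WORM) retention rule."""
    if mode not in ("GOVERNANCE", "COMPLIANCE"):
        raise ValueError("mode must be GOVERNANCE or COMPLIANCE")
    if days < 1:
        raise ValueError("days must be a positive integer")
    return {
        "Bucket": bucket,
        "ObjectLockConfiguration": {
            "ObjectLockEnabled": "Enabled",
            "Rule": {"DefaultRetention": {"Mode": mode, "Days": days}},
        },
    }


# With boto3 installed and credentials configured, these could be applied as:
#   s3 = boto3.client("s3")
#   s3.put_bucket_versioning(**versioning_params("example-backup-bucket"))
#   s3.put_object_lock_configuration(**object_lock_params("example-backup-bucket", 30))
print(versioning_params("example-backup-bucket"))
print(object_lock_params("example-backup-bucket", days=30))
```

In compliance mode, a locked object version cannot be overwritten or deleted by any user until the retention period ends, so choose the retention period deliberately.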
For more information on developing a DR plan and a business continuity plan, see the following pages:
- Backup and Disaster Recovery
- Plan for Disaster Recovery (from the AWS Well-Architected Framework)
#2 – Encrypt your data
In addition to holding data for ransom, more recent ransomware events increasingly use double extortion schemes. In a double extortion scheme, the actor not only encrypts the data but also exfiltrates it and threatens to release it if the ransom isn’t paid.
To help protect your data, you should always enable encryption of the data and segment your workflow so that authorized systems and users have limited access to use the key material to decrypt the data.
As an example, let’s say that you have a web application that uses an API to write data objects into an S3 bucket. Rather than allowing the application to have full read and write permissions, limit the application to just a single operation (for example, PutObject). Smaller, more reusable code is also easier to manage, so segmenting the workflow helps developers work more quickly. An example of this type of workflow, in which separate CMK policies are used for read operations and write operations to limit access, is laid out in Figure 3.
It’s important to note that although AWS managed CMKs can help you to meet regulatory requirements for data at rest encryption, they don’t support custom key policies. Customers who want to control how their key material is used must use a customer managed CMK.
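One way to picture this segmentation is as two separate permission sets, one for the writer and one for the reader. The sketch below builds identity policies rather than the CMK policies in Figure 3, and it assumes that the bucket uses server-side encryption with AWS KMS; the bucket name, key ARN, and function names are hypothetical.

```python
def writer_policy(bucket, key_arn):
    """Permissions for a component that only writes objects (PutObject)."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {"Effect": "Allow", "Action": "s3:PutObject",
             "Resource": f"arn:aws:s3:::{bucket}/*"},
            # Writing an SSE-KMS object needs a data key, not decrypt permission.
            {"Effect": "Allow", "Action": "kms:GenerateDataKey", "Resource": key_arn},
        ],
    }


def reader_policy(bucket, key_arn):
    """Permissions for a component that only reads objects (GetObject)."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {"Effect": "Allow", "Action": "s3:GetObject",
             "Resource": f"arn:aws:s3:::{bucket}/*"},
            {"Effect": "Allow", "Action": "kms:Decrypt", "Resource": key_arn},
        ],
    }


def allowed_actions(policy):
    """Collect every action that the policy allows."""
    return {s["Action"] for s in policy["Statement"] if s["Effect"] == "Allow"}


KEY_ARN = "arn:aws:kms:us-east-1:111122223333:key/EXAMPLE-KEY-ID"  # placeholder
print(sorted(allowed_actions(writer_policy("example-app-bucket", KEY_ARN))))
print(sorted(allowed_actions(reader_policy("example-app-bucket", KEY_ARN))))
```

With this split, a compromised writer cannot read or decrypt existing objects. Multipart uploads to an SSE-KMS bucket also require `kms:Decrypt`, so a writer that uploads large objects may need a broader grant than shown here.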
For data that is stored locally on Amazon EBS, remember that while the blocks are encrypted by using AWS KMS, after the server boots, your data is unencrypted locally at the operating system level. If you have sensitive data that is being stored as part of your application locally, consider using tooling like the AWS Encryption SDK or Encryption CLI to store that data in an encrypted format.
As Amazon Chief Technology Officer Werner Vogels says, encrypt everything!
#3 – Apply critical patches
In order for an actor to get access to a system, they must take advantage of a vulnerability or misconfiguration. Although many organizations patch their infrastructure, some only do so on a weekly or monthly basis, and that can be inadequate for patching critical systems that require 24/7 operation. Increasingly, threat actors have the ability to reverse engineer patches or Common Vulnerabilities and Exposures (CVE) announcements in hours. You should deploy security-related patches, especially those that are high severity, with the least amount of delay possible.
AWS Systems Manager can help you to automate this process in the cloud and on premises. With Systems Manager patch baselines, you can apply patches based on machine tags (for example, development versus production) but also based on patch type. For example, the predefined patch baseline AWS-AmazonLinuxDefaultPatchBaseline approves all operating system patches that are classified as “Security” and that have a severity level of “Critical” or “Important.” Patches are auto-approved seven days after release. The baseline also auto-approves all patches with a classification of “Bugfix” seven days after release.
If you want a more aggressive patching posture, you can instead create a custom baseline. For example, in Figure 5, I’ve created a baseline that covers critical-severity patches for all Windows versions.
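A custom baseline like the one in Figure 5 could be expressed as the parameters for the Systems Manager `create_patch_baseline` API. This is a sketch under assumptions: the baseline name is hypothetical, and approving patches zero days after release is my illustration of an aggressive posture, not a setting taken from the figure.

```python
def critical_windows_baseline(name, approve_after_days=0):
    """Parameters for a Windows baseline that auto-approves critical patches."""
    if approve_after_days < 0:
        raise ValueError("approve_after_days cannot be negative")
    return {
        "Name": name,
        "OperatingSystem": "WINDOWS",
        "ApprovalRules": {
            "PatchRules": [
                {
                    "PatchFilterGroup": {
                        "PatchFilters": [
                            # Match patches that Microsoft rates as Critical.
                            {"Key": "MSRC_SEVERITY", "Values": ["Critical"]},
                        ]
                    },
                    # 0 means approve as soon as the patch is released.
                    "ApproveAfterDays": approve_after_days,
                }
            ]
        },
    }


# With boto3 installed and credentials configured, this could be applied as:
#   boto3.client("ssm").create_patch_baseline(**params)
params = critical_windows_baseline("example-critical-windows")
print(params)
```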
I can then set up an hourly scheduled event to scan all or part of my fleet and patch based on this baseline. In Figure 6, I show an example of this type of workflow taken from this AWS blog post, which gives an overview of the patch baseline process and covers how to use it in your cloud environment.
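One possible way to schedule an hourly run is a State Manager association that invokes the `AWS-RunPatchBaseline` document. The sketch below only builds the request parameters; the tag key `Environment`, its value, and the association name are assumptions, and the workflow in Figure 6 may be wired differently.

```python
def hourly_patch_association(tag_key, tag_value, operation="Install"):
    """Parameters for an association that scans or patches tagged instances hourly."""
    if operation not in ("Scan", "Install"):
        raise ValueError("operation must be Scan or Install")
    return {
        "Name": "AWS-RunPatchBaseline",          # AWS-provided SSM document
        "AssociationName": f"hourly-{operation.lower()}-{tag_value}",
        "Targets": [{"Key": f"tag:{tag_key}", "Values": [tag_value]}],
        "ScheduleExpression": "rate(1 hour)",
        "Parameters": {"Operation": [operation]},
    }


# With boto3 installed and credentials configured, this could be applied as:
#   boto3.client("ssm").create_association(**association)
association = hourly_patch_association("Environment", "production")
print(association)
```

Using `Scan` instead of `Install` reports missing patches without applying them, which can be a safer first step for part of a fleet.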
In addition, if you’re using AWS Organizations, this blog post will show you how you can apply this method organization-wide.
AWS offers many tools to make patching easier, and making sure that your servers are fully patched will greatly reduce your susceptibility to ransomware.
#4 – Follow a security standard
Don’t guess whether your environment is secure. Most commercial and public-sector customers are subject to some form of regulation or compliance standard. You should be measuring your security and risk posture against recognized standards in an ongoing practice. If you don’t have a framework that you need to follow, consider using the AWS Well-Architected Framework as your baseline.
With AWS Security Hub, you can view data from AWS security services and third-party tools in a single view and also benchmark your account against standards or frameworks like the CIS AWS Foundations Benchmark, the Payment Card Industry Data Security Standard (PCI DSS), and the AWS Foundational Security Best Practices. These are automated scans of your environment that can alert you when drifts in compliance occur. You can also choose to use AWS Config conformance packs to automate a subset of controls for NIST 800-53, Health Insurance Portability and Accountability Act (HIPAA), Korea – Information Security Management System (ISMS), and a growing list of other standards, with over 60 conformance pack templates available at the time of this publication.
Another important aspect of following best practices is to implement least privilege at all levels. In AWS, you can use IAM to write policies that enforce least privilege. These policies, when applied through roles, will limit the actor’s capability to advance in your environment. Access Analyzer is a new feature of IAM that allows you to more easily generate least privilege permissions, and it is covered in this blog post.
#5 – Make sure you’re monitoring and automating responses
Make sure you have robust monitoring and alerting in place. Each of the items I described earlier is a powerful tool to help you to protect against a ransomware event, but none will work unless you have strong monitoring in place to validate your assumptions.
Here, I want to provide some specific examples based on the items earlier in this post.
If you’re backing up your data by using AWS Backup, as described in item #1 (Set up the ability to recover your apps and data), you should have Amazon CloudWatch set up to send alerts when a backup job fails. When an alert is triggered, you also need to act on it. If your response to an AWS alert email would be to re-run the job, you should automate that workflow by using AWS Lambda. If a subsequent failure occurs, open a ticket in your ticketing service automatically or page your operations team.
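The retry-then-escalate workflow described above can be sketched as the core logic of a Lambda function. The event pattern and the `backupJobId` field reflect my understanding of the events that AWS Backup publishes and should be verified against the service documentation. The `rerun_job` and `escalate` callables, and the way previous attempts are counted, are hypothetical stand-ins for your own integration code.

```python
# Event pattern for a rule that matches failed AWS Backup jobs (assumed shape).
BACKUP_JOB_FAILED_PATTERN = {
    "source": ["aws.backup"],
    "detail-type": ["Backup Job State Change"],
    "detail": {"state": ["FAILED"]},
}


def handle_backup_failure(event, previous_attempts, rerun_job, escalate, max_retries=1):
    """Re-run a failed backup job, or escalate once the retries are used up.

    previous_attempts is how many times this job was already retried; the
    caller is assumed to track that count (for example, in a small table).
    """
    detail = event.get("detail", {})
    if detail.get("state") != "FAILED":
        return "ignored"
    job_id = detail.get("backupJobId", "unknown")
    if previous_attempts < max_retries:
        rerun_job(job_id)      # for example, start a new on-demand backup job
        return "retried"
    escalate(job_id)           # for example, open a ticket or page the team
    return "escalated"


# Local demonstration with lists standing in for the real integrations.
reruns, pages = [], []
event = {
    "source": "aws.backup",
    "detail-type": "Backup Job State Change",
    "detail": {"state": "FAILED", "backupJobId": "job-1234"},
}
print(handle_backup_failure(event, 0, reruns.append, pages.append))  # retried
print(handle_backup_failure(event, 1, reruns.append, pages.append))  # escalated
```

Passing the integrations in as callables keeps the decision logic testable without any AWS access.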
If you’re encrypting all of your data, as described in item #2 (Encrypt your data), are you watching AWS CloudTrail to see when AWS KMS denies permission to an operation?
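As a starting point for that kind of monitoring, you could filter CloudTrail records for AWS KMS calls that were denied. The sketch below works on already-parsed CloudTrail records and assumes that denied calls carry an `errorCode` beginning with `AccessDenied`; the sample records are fabricated purely for demonstration.

```python
def kms_denials(records):
    """Return CloudTrail records for AWS KMS calls that were denied."""
    return [
        record
        for record in records
        if record.get("eventSource") == "kms.amazonaws.com"
        and str(record.get("errorCode", "")).startswith("AccessDenied")
    ]


# Made-up sample records; real ones come from the "Records" array in a
# CloudTrail log file.
sample_records = [
    {"eventSource": "kms.amazonaws.com", "eventName": "Decrypt",
     "errorCode": "AccessDenied"},
    {"eventSource": "kms.amazonaws.com", "eventName": "GenerateDataKey"},
    {"eventSource": "s3.amazonaws.com", "eventName": "GetObject",
     "errorCode": "AccessDenied"},
]
for record in kms_denials(sample_records):
    print(record["eventName"], record["errorCode"])
```

A sudden increase in denied `Decrypt` calls from an unexpected principal is the kind of signal that is worth alerting on.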
Additionally, are you monitoring and acting on patch management baselines as described in item #3 (Apply critical patches) and responding when a patch isn’t able to successfully deploy?
Last, are you watching the compliance status of your Security Hub compliance reports and taking action on findings? You also need to monitor your environment for suspicious activity, investigate, and act quickly to mitigate risks. This is where Amazon GuardDuty, Security Hub, and Amazon Detective can be valuable.
AWS makes it easier to create automated responses to the alerts I mentioned earlier. The multi-account response solution in this blog post provides a good starting point that you can use to customize a response based on the needs of your workload.
Conclusion
In this blog post, I showed you the top five actions that you can take to protect against and recover from a ransomware event.
In addition to the advice provided here, NIST has recently published guidance on the prevention of ransomware, which you can view in the NIST SP 1800-25 publication.