AWS Storage Blog

Detect malware threats using AWS Transfer Family

Securely sharing files over SFTP, FTP, and FTPS is a staple within many business-to-business (B2B) workflows. Across industries, companies use file transfer to transmit inventory, invoice, and compliance information. It is critical for companies to make sure that shared files do not have any malicious content that could compromise their systems. Keeping shared files free of malicious content requires continuous monitoring to detect and respond to security findings for your workloads. As ransomware events continue to become more prevalent and evolve, protecting the integrity of data is paramount for companies. Establishing preventive measures to protect against these events can help companies avoid financial losses and reputational damage. For your file transfer workloads, you can scan each file you receive and isolate compromised files before they ever reach your downstream systems. New files only reach your systems after an automated vetting process runs a series of security checks, such as anti-malware scans.

Many customers choose AWS Transfer Family as their secure transfer service for files in Amazon Simple Storage Service (Amazon S3) and Amazon Elastic File System (Amazon EFS). Transfer Family supports industry-standard protocols like SFTP, so you can quickly replace self-managed resources with a fully managed environment. Transfer Family managed workflows let you preprocess files using predefined steps such as copy, tag, decrypt, and custom preprocessing with AWS Lambda. For example, you can bring your own code to scan for malicious content, obfuscate personally identifiable information (PII), encrypt sensitive information, and send notifications. Transfer Family managed workflows automatically initialize when the file upload completes and send logs to Amazon CloudWatch.

In this post, we demonstrate how to configure a workflow that invokes Clam Antivirus (ClamAV), an open-source anti-malware engine for detecting trojans, viruses, malware, and other malicious threats. This post provides a means to continuously monitor for threats, isolate malicious content, and periodically update the signature definitions to be prepared for the latest threats. This post includes AWS CloudFormation for one-click deploying this pattern into your environment.

Solution overview

The purpose of this sample architecture is to create a Transfer Family server with a managed workflow that scans each uploaded file with ClamAV. This Transfer Family server uses the SFTP protocol to transfer files into Amazon S3, but workflows also support FTPS and FTP Transfer Family servers and file transfers into Amazon EFS. The sample shows file transfers into Amazon S3 and a workflow for scanning files as you receive them. However, the ClamAV workflow step can also be added to your existing workflow. For example, you may want to decrypt the file first, then scan it with ClamAV. This sample includes a way to refresh the ClamAV code and signature definitions so your workflow step can stay up-to-date without manual effort. The following figure describes the workflows needed to authenticate your Transfer Family server, upload a file and invoke the managed workflow, and regularly update your signature definitions.

Figure 1: Detect threats using Transfer Family – Architecture

  1. The user sends an authentication request to the Transfer Family server, which forwards the request to authenticate the user using a custom identity provider.
  2. The user uploads the files to the Transfer Family server. Each file is put into an S3 bucket and invokes a distinct workflow execution.
  3. The Transfer Family managed workflow initializes the sequence of processing steps you have configured. In the custom workflow step, the Lambda function scans each file using a container image with ClamAV installed.
  4. Based on the scan result from the Lambda function, the managed workflow tags each file as either INFECTED or CLEAN.
  5. An Amazon EventBridge scheduler rule is configured to run based on a cron expression to update the ClamAV image and virus definitions.
  6. AWS CodeBuild builds the container image, adds the latest ClamAV virus definitions, and uploads to Amazon Elastic Container Registry (Amazon ECR).
  7. A Lambda function pulls the built container image from Amazon ECR and updates the scanning Lambda function that is part of the managed workflow.
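Steps 3 and 4 above can be sketched as a Lambda handler for a custom workflow step. This is a minimal sketch, not the code shipped in the sample repository. It assumes the `clamscan` binary is on the image's path, that the file fits in `/tmp`, and that the workflow treats a `SUCCESS` step state as clean and a `FAILURE` step state as infected. The function names are hypothetical; the event fields and `send_workflow_step_state` come from the Transfer Family custom step interface, and the exit codes are those documented for `clamscan`.

```python
import os
import subprocess


def classify_scan(returncode: int) -> str:
    """Map a clamscan exit code to a verdict (0 = no virus, 1 = virus found, other = error)."""
    if returncode == 0:
        return "CLEAN"
    if returncode == 1:
        return "INFECTED"
    return "ERROR"


def parse_workflow_event(event: dict) -> dict:
    """Extract the fields a custom workflow step needs from the invocation event."""
    details = event["serviceMetadata"]["executionDetails"]
    location = event["fileLocation"]
    return {
        "workflow_id": details["workflowId"],
        "execution_id": details["executionId"],
        "token": event["token"],
        "bucket": location["bucket"],
        "key": location["key"],
    }


def handler(event, context):
    # Imported lazily so the helpers above can be exercised without the AWS SDK.
    import boto3

    info = parse_workflow_event(event)

    # Assumption: the object is small enough for Lambda's /tmp scratch space.
    local_path = os.path.join("/tmp", os.path.basename(info["key"]))
    boto3.client("s3").download_file(info["bucket"], info["key"], local_path)

    # Assumption: clamscan and its signature database are baked into the image.
    result = subprocess.run(
        ["clamscan", "--no-summary", local_path],
        capture_output=True,
        text=True,
    )
    verdict = classify_scan(result.returncode)

    # Report back to the workflow; anything other than CLEAN is reported as a failure.
    boto3.client("transfer").send_workflow_step_state(
        WorkflowId=info["workflow_id"],
        ExecutionId=info["execution_id"],
        Token=info["token"],
        Status="SUCCESS" if verdict == "CLEAN" else "FAILURE",
    )
    return {"verdict": verdict, "output": result.stdout}
```

Treating scan errors the same as infections is a conservative choice: a file that could not be scanned is never reported as clean.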

This solution runs multiple steps (including a custom Lambda function step and a tag step) as part of the Transfer Family managed workflow for the same upload event. If any step changes the state of the file, the subsequent steps need matching modifications. For example, if you have a custom step to decompress a file and output it to a new Amazon S3 prefix, then you must change the ClamAV image to point to the correct path.
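To make the step sequence concrete, the following sketch builds a workflow definition with a custom scan step followed by a tag step, plus an exception step that tags the file as infected. It is an illustration under stated assumptions rather than the definition created by the sample's CloudFormation templates: the tag key `scan-status`, the step names, and the 300-second timeout are hypothetical. The dictionary shape follows the `Steps` and `OnExceptionSteps` parameters of the Transfer Family `create_workflow` API.

```python
def build_workflow_steps(scan_lambda_arn: str, tag_key: str = "scan-status") -> dict:
    """Return Steps/OnExceptionSteps for a scan-then-tag managed workflow.

    tag_key and the step names are hypothetical placeholders.
    """
    # Every step operates on the originally uploaded file.
    source = "${original.file}"

    def tag_step(name: str, value: str) -> dict:
        return {
            "Type": "TAG",
            "TagStepDetails": {
                "Name": name,
                "Tags": [{"Key": tag_key, "Value": value}],
                "SourceFileLocation": source,
            },
        }

    return {
        "Steps": [
            {
                "Type": "CUSTOM",
                "CustomStepDetails": {
                    "Name": "ClamAVScan",
                    "Target": scan_lambda_arn,
                    "TimeoutSeconds": 300,  # assumed; tune to your file sizes
                    "SourceFileLocation": source,
                },
            },
            # Runs only if the custom step reported SUCCESS.
            tag_step("TagClean", "CLEAN"),
        ],
        # Runs if any nominal step fails or times out.
        "OnExceptionSteps": [tag_step("TagInfected", "INFECTED")],
    }
```

The result could be passed to `boto3.client("transfer").create_workflow(Description="ClamAV scan", **build_workflow_steps(arn))`. If an earlier step writes its output to a new location, the scan step would need `${previous.file}` as its source instead of `${original.file}`.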

Prerequisites

This post assumes you have a foundational understanding of the following AWS services:

For instructions, refer to Creating an AWS account.

Solution deployment

In this section, you deploy the CloudFormation templates that create the following resources:

  • S3 bucket
  • Transfer Family server
  • Lambda functions
  • AWS Secrets Manager secrets
  • AWS CodeBuild project
  • Amazon ECR repository
  • EventBridge rules
  • AWS Identity and Access Management (IAM) Roles and Policies

To deploy the CloudFormation template, follow these steps:

  1. Open AWS CloudShell in your AWS account.
  2. Clone this post’s GitHub repository using the git clone command (git clone https://github.com/aws-samples/transfer-family-anti-virus-cdk.git).

Figure 2: Screenshot of git clone command in AWS CloudShell

  3. Change directory into the “transfer-family-anti-virus-cdk” folder (cd transfer-family-anti-virus-cdk).
  4. Provide executable permissions to the deployStack.sh bash script (chmod +x deployStack.sh).
  5. Run the deployStack bash script with the USER_NAME as an argument to create the required resources (./deployStack.sh $USER_NAME).

Figure 3: Screenshot of running the deployStack script

  6. Copy the SFTPEndpoint from the output and note the user name from the previous step to use later. SFTPEndpoint is the fully qualified domain name of your Transfer Family server.

Figure 4: Screenshot showing the SFTP endpoint output

  7. Retrieve the password generated and stored in the AWS Secrets Manager secret named SFTP/$USER_NAME to use later.

Figure 5: Screenshot showing the username and generated password for use

The script takes less than 20 minutes to run and create the necessary resources for the solution.
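The password from the Secrets Manager secret can also be read programmatically instead of through the console. The following is a minimal sketch, assuming the secret value is a JSON document with a key named like `Password` (the exact key names are an assumption, so the lookup is case-insensitive). The helper names are hypothetical; `get_secret_value` is the standard Secrets Manager API call.

```python
import json


def secret_id_for(user_name: str) -> str:
    """Build the secret name used by the deployment script (SFTP/$USER_NAME)."""
    return f"SFTP/{user_name}"


def find_password(secret: dict) -> str:
    """Return the password entry, matching the key case-insensitively.

    Assumption: the secret JSON holds the password under a key such as "Password".
    """
    for key, value in secret.items():
        if key.lower() == "password":
            return value
    raise KeyError("no password entry found in secret")


def fetch_sftp_password(user_name: str) -> str:
    # Imported lazily so the helpers above can be exercised without the AWS SDK.
    import boto3

    response = boto3.client("secretsmanager").get_secret_value(
        SecretId=secret_id_for(user_name)
    )
    return find_password(json.loads(response["SecretString"]))
```

Calling `fetch_sftp_password("alice")` would require credentials allowed to read the secret, and the returned value should be handled like any other credential rather than printed or logged.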

Test the configuration

You can test the end-to-end configuration by following these steps:

  1. Upload a clean file through the SFTP endpoint, using the user name and password from Steps 6 and 7 in the ‘Solution deployment’ section.
  2. In a few seconds, the managed transfer workflow is executed and the S3 object in the clamav-scan-landingzone-* bucket is tagged as CLEAN. It’s now accessible for download.

Figure 6: Screen capture of the Amazon S3 Object tag for a clean file

  3. Download an anti-malware test file from eicar.org. The test file was developed by the European Institute for Computer Antivirus Research (EICAR). Note that you must adhere to your organization’s information security best practices and guidelines. Make sure you carefully read and understand the terms of use for the test file before downloading.
  4. Upload the anti-malware test file through the SFTP endpoint, using the user name and password from Steps 6 and 7 in the ‘Solution deployment’ section.
  5. The managed workflow runs, and the custom preprocessing step uses the Lambda function to scan the uploaded file for malware.

Figure 7: Screen capture of the CloudWatch Logs for an infected file

  6. The Amazon S3 object in the clamav-scan-landingzone-* bucket is tagged as INFECTED. The file is not available for download, as the solution denies the download of infected objects.

Figure 8: Screen capture of the Amazon S3 Object tag for an infected file

  7. (Optional) Edit the EventBridge rule (clamav-codebuild-cronEventRule-*) configured to refresh the ClamAV code and signatures. Note that you should troubleshoot and monitor issues using the logs generated in CloudWatch.

Figure 9: Screen capture of editing an EventBridge scheduled standard rule
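The tag checks in the test steps above can also be scripted rather than done in the console. The sketch below reads an object's tags and looks for a CLEAN or INFECTED value. Because this post does not state the tag key, the helper matches on the tag value only, and the function names are hypothetical. You need to supply the full bucket name, since the suffix after clamav-scan-landingzone- is specific to your deployment.

```python
from typing import Optional


def scan_verdict(tag_set: list) -> Optional[str]:
    """Return 'INFECTED', 'CLEAN', or None when the object carries neither tag value.

    tag_set is the TagSet list returned by S3 GetObjectTagging:
    [{"Key": "...", "Value": "..."}, ...]
    """
    values = {tag["Value"] for tag in tag_set}
    # INFECTED wins if both values are somehow present.
    if "INFECTED" in values:
        return "INFECTED"
    if "CLEAN" in values:
        return "CLEAN"
    return None


def fetch_verdict(bucket: str, key: str) -> Optional[str]:
    # Imported lazily so scan_verdict can be exercised without the AWS SDK.
    import boto3

    response = boto3.client("s3").get_object_tagging(Bucket=bucket, Key=key)
    return scan_verdict(response["TagSet"])
```

A result of `None` shortly after an upload most likely means the workflow has not finished yet, so a test script would poll a few times before treating it as a failure.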

Cleaning up

In this post, you created several components that have a cost. To avoid incurring future charges, remove the resources with the following steps:

  1. Empty the S3 bucket.
  2. Delete the ECR repository.
  3. Delete the CloudFormation stack.
  4. Remove the cloned repository.

Conclusion

In this post, we demonstrated how to automate ingesting and scanning content using AWS Transfer Family and managed workflows. We covered integrating a serverless continuous integration/continuous delivery (CI/CD) system to keep the scanning system up-to-date with the most recent signature files.

B2B communication includes exchanging files over protocols like SFTP, FTPS, and FTP. Transfer Family managed workflows let you quickly integrate open-source tools like ClamAV as a means to continuously monitor for threats, isolate malicious content, and periodically update signature definitions. The solution helps detect and respond to threats posed by malicious content shared over SFTP, FTPS, and FTP. This helps protect the integrity of your data and establishes preventive measures against malicious events, so you can avoid reputational damage, financial loss, and other unfavorable scenarios.

You can deploy the complete solution into your account by following the steps in the “Solution deployment” section. More advanced users can deploy this post’s GitHub repository using AWS Cloud Development Kit (AWS CDK). For more information about Transfer Family, visit the AWS Transfer Family product page.

Nate Bachmeier

Nate Bachmeier is an AWS Senior Solutions Architect who nomadically explores New York, one cloud integration at a time. He specializes in migrating and modernizing applications. Besides this, Nate is a full-time student and has two kids.

Satish Patil

Satish is a Senior Solution Architect based out of Texas. He is currently working with AWS partners in the WWPS focused on migrations.

Pranjali Dani

Pranjali is an Enterprise Solution Architect based in the US West region. She specializes in AWS IoT services and Migration services. She is very active in Women@AWS initiatives and is part of several related programs encouraging women and the young generation to take an active role in STEM/Engineering. She has two little girls and loves volunteering in her free time.

Ramasamy Seranthaiya

Ramasamy is an Enterprise Solutions Architect based out of Philadelphia. He enables AWS customers to achieve business outcomes with AWS. He is an ardent cricket follower and spends most of his free time on cricket-related activities.