AWS Architecture Blog

Building a Cloud-Native File Transfer Platform Using AWS Transfer Family Workflows

October 2023: SFTP connectors have been released; therefore, Scenario 2, Step D has been updated.


File-based transfers are one of the most prevalent mechanisms for organizations to exchange data over various interfaces with their partners and consumers. There are specialized third-party managed file transfer (MFT) products available in the market that provide rich workflows for managing these transfers.

A typical MFT platform provides features to perform a series of linked pre- and post-file upload processing steps. The new managed workflows feature within AWS Transfer Family allows you to define a lightweight workflow that is invoked in response to file uploads. This feature, combined with the core SFTP, FTPS, and FTP functionality, enables you to build a cloud-native MFT platform for your organization. The workflows are also integrated with Amazon CloudWatch to provide complete traceability.

Before this feature was released, an MFT architecture based on Transfer Family involved responding to Amazon Simple Storage Service (Amazon S3) events within AWS Lambda functions. There was no overarching orchestration layer. With the new managed workflows feature, sequencing steps and handling errors are greatly simplified.

In this blog, I show you how to architect common MFT scenarios using the new Transfer Family managed workflows feature. This will help you build a robust and well-integrated cloud-native MFT platform.

Scenario 1: Inbound Flow – file push by external providers

In this scenario, a file is supplied by an external data provider. It must be decrypted, checked for errors, and transferred to an internal application area (Amazon S3 bucket) for further processing by an application.

The internal application that processes the file could be an in-house Java application, an enterprise resource planning (ERP) system that processes payments, a telecommunication billing system that consumes call data, or even a financial regulatory organization's system that scans daily share trading data for anomalies.

The architecture for this scenario is presented in Figure 1. Here’s how it works:

  1. The external data provider connects to the organization’s public Transfer Family endpoint and provides the authentication credentials.
  2. The service authenticates the user via the pre-configured authentication mechanism. This could be a custom identity provider, AWS Directory Service, or service managed.
  3. Once authenticated, the data provider uploads the file to a logical folder. This results in the file being stored in the underlying Upload S3 bucket.
  4. Transfer Family initiates the configured workflow once the file has been uploaded to the S3 bucket. The workflow performs the required pre-processing steps, including:
    • Invoking a Lambda function to decrypt the file.
    • Invoking a Lambda function to ensure the file data is valid.
    • Copying the file to the Application S3 bucket.
    • Deleting or archiving the file by copying it to another S3 bucket or storing it with a different S3 prefix.
    • If an error occurs, the workflow exception handler moves (copy and delete) the file to the Quarantine S3 bucket or stores it with a different S3 prefix.

Figure 1. MFT inbound flow – push by data provider
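
The workflow steps above can be sketched as the request parameters for the Transfer Family CreateWorkflow API. This is a minimal sketch, not a definitive implementation: the Lambda ARNs, bucket names, prefixes, and step names are hypothetical placeholders, and it assumes the decryption function overwrites the uploaded object in place so that later steps can keep referring to `${original.file}`.

```python
"""Sketch of the Scenario 1 workflow as CreateWorkflow request parameters."""

# Hypothetical placeholders: replace with your own functions and buckets.
DECRYPT_LAMBDA_ARN = "arn:aws:lambda:eu-west-2:111122223333:function:mft-decrypt"
VALIDATE_LAMBDA_ARN = "arn:aws:lambda:eu-west-2:111122223333:function:mft-validate"
APPLICATION_BUCKET = "example-mft-application"
ARCHIVE_BUCKET = "example-mft-archive"
QUARANTINE_BUCKET = "example-mft-quarantine"

# Assumption: the decrypt function rewrites the uploaded object in place,
# so every step can read from the originally uploaded file location.
SOURCE = "${original.file}"


def custom_step(name, lambda_arn, timeout_seconds=60):
    """A step that invokes a Lambda function (decrypt, validate, ...)."""
    return {
        "Type": "CUSTOM",
        "CustomStepDetails": {
            "Name": name,
            "Target": lambda_arn,
            "TimeoutSeconds": timeout_seconds,
            "SourceFileLocation": SOURCE,
        },
    }


def copy_step(name, bucket, key_prefix):
    """A step that copies the file to another bucket or prefix."""
    return {
        "Type": "COPY",
        "CopyStepDetails": {
            "Name": name,
            "DestinationFileLocation": {
                "S3FileLocation": {"Bucket": bucket, "Key": key_prefix}
            },
            "OverwriteExisting": "FALSE",
            "SourceFileLocation": SOURCE,
        },
    }


def delete_step(name):
    """A step that deletes the originally uploaded file."""
    return {
        "Type": "DELETE",
        "DeleteStepDetails": {"Name": name, "SourceFileLocation": SOURCE},
    }


def build_inbound_workflow():
    """Nominal steps plus the exception handler described in Scenario 1."""
    return {
        "Description": "Inbound MFT flow: decrypt, validate, copy, clean up",
        "Steps": [
            custom_step("decrypt-file", DECRYPT_LAMBDA_ARN),
            custom_step("validate-file", VALIDATE_LAMBDA_ARN),
            copy_step("copy-to-application", APPLICATION_BUCKET, "inbound/"),
            copy_step("archive-upload", ARCHIVE_BUCKET, "archive/"),
            delete_step("delete-upload"),
        ],
        # "Move" to quarantine is expressed as a copy followed by a delete.
        "OnExceptionSteps": [
            copy_step("copy-to-quarantine", QUARANTINE_BUCKET, "quarantine/"),
            delete_step("delete-failed-upload"),
        ],
    }


def create_workflow(transfer_client):
    """Create the workflow; transfer_client is expected to be boto3.client('transfer')."""
    response = transfer_client.create_workflow(**build_inbound_workflow())
    return response["WorkflowId"]
```

The returned workflow ID would then be attached to the Transfer Family server so that it runs on each completed upload. Keeping the definition as plain data makes it easy to review the step order and the exception path before anything is created in an account.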

Scenario 2: Outbound Flow – file push to external consumers

In this scenario, an internal application generates files that are to be provided to external parties. Examples include submissions to credit check agencies and direct debit or payment files sent to banking institutions. These files must be re-formatted, encrypted, and transferred to an external SFTP site or an API endpoint.

The architecture for implementing this scenario is presented in Figure 2. Here’s how it works:

  1. An internal application connects to the organization’s private Transfer Family SFTP endpoint, hosted within Amazon Virtual Private Cloud (Amazon VPC), and provides the authentication information.
  2. The service authenticates the application using the pre-configured authentication mechanism. This could be a custom identity provider, Directory Service, or service managed.
  3. Once authenticated, the application uploads the file to a logical folder. This results in the file being stored in the underlying Upload S3 bucket.
  4. Transfer Family initiates the configured workflow once the file has been uploaded to the S3 bucket. The workflow performs the required processing steps, including:
    • Invoking a Lambda function to reformat and encrypt the file.
    • Invoking an SFTP connector to transfer the file to the external SFTP site, or a custom Lambda function to send it to an API endpoint.
    • Copying the transferred file to the Processed S3 bucket or storing the file with a different Amazon S3 prefix.
    • Emptying the internal upload folder by deleting the file.
    • If an error occurs, the workflow exception handler moves (copy and delete) the file to the Error S3 bucket or stores it with a different S3 prefix.

Figure 2. MFT outbound flow – push to data consumer
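
The transfer step in this flow can be sketched as a Lambda function used as a custom workflow step: it asks an SFTP connector to send the file and then reports the step result back to the workflow. This is a hedged sketch under stated assumptions. The connector ID and remote directory are hypothetical, the event field names follow the payload that Transfer Family passes to custom steps, and a `SUCCESS` status here only means the transfer request was accepted, not that the remote site has received the file.

```python
"""Sketch of a custom workflow step that pushes a file through an SFTP connector."""

CONNECTOR_ID = "c-0123456789abcdef0"  # hypothetical connector ID
REMOTE_DIRECTORY = "/inbox"           # hypothetical directory on the external site


def build_send_path(file_location):
    """Turn the workflow's S3 file location into the '/bucket/key' form."""
    return f"/{file_location['bucket']}/{file_location['key']}"


def handle_push_step(event, transfer_client):
    """Start the connector transfer, then report the step state to the workflow."""
    details = event["serviceMetadata"]["executionDetails"]
    status = "SUCCESS"
    try:
        transfer_client.start_file_transfer(
            ConnectorId=CONNECTOR_ID,
            SendFilePaths=[build_send_path(event["fileLocation"])],
            RemoteDirectoryPath=REMOTE_DIRECTORY,
        )
    except Exception:
        # Any failure is reported so that the workflow's exception handler runs.
        status = "FAILURE"

    # Without this callback the step stays pending until it times out.
    transfer_client.send_workflow_step_state(
        WorkflowId=details["workflowId"],
        ExecutionId=details["executionId"],
        Token=event["token"],
        Status=status,
    )
    return status


def lambda_handler(event, context):
    # Imported here so the logic above can be exercised without AWS access.
    import boto3

    return handle_push_step(event, boto3.client("transfer"))
```

Passing the client into `handle_push_step` keeps the logic testable with a stub. For an API endpoint instead of an SFTP site, the same callback pattern would apply, with the connector call replaced by your own HTTP request.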

Scenario 3: Outbound Flow – file pull by external consumers

In this scenario, an internal application generates files that are to be provided to external parties. However, in this case, the files are downloaded or “pulled” from the external facing SFTP download folder by the consumers.

Examples include cases where external parties download files on a pre-defined schedule, or where consumers must download the files manually because they have no SFTP endpoint on their side.

The architecture for this scenario is presented in Figure 3. In this case, two instances of Transfer Family are created:

  1. The internal-facing private instance from Scenario 2.
  2. An external-facing public instance to be used by the consumer for file downloads.

Here’s how it works:

Steps A through D. The flow is the same as steps 1 through 4 of Scenario 2, except that the internal workflow task uploads files to the S3 bucket underneath the external-facing instance of Transfer Family.
E. The external consumer connects to the organization’s public Transfer Family endpoint and provides the authentication credentials.
F. The external facing Transfer Family service instance authenticates the consumer using the pre-configured authentication mechanism. This could be a custom identity provider, AWS Directory Service, or service managed.
G. Once authenticated, the data consumer downloads the file from the external Transfer Family SFTP server instance.

Figure 3. MFT outbound flow – pull by data consumer
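
Because consumers in this scenario only download files, one option is to give each consumer read-only access to their own download prefix on the external-facing instance. The sketch below builds such an IAM policy document. The bucket name, prefix layout, and the read-only restriction itself are my assumptions for illustration; they are not prescribed by the architecture above.

```python
"""Sketch of a download-only IAM policy for an external consumer's folder."""
import json


def build_download_only_policy(bucket, prefix):
    """Allow listing and reading under one prefix; no write or delete actions."""
    prefix = prefix.strip("/")
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "ListConsumerFolder",
                "Effect": "Allow",
                "Action": "s3:ListBucket",
                "Resource": f"arn:aws:s3:::{bucket}",
                # Restrict listing to the consumer's own prefix.
                "Condition": {"StringLike": {"s3:prefix": [prefix, f"{prefix}/*"]}},
            },
            {
                "Sid": "DownloadOnly",
                "Effect": "Allow",
                "Action": ["s3:GetObject", "s3:GetObjectVersion"],
                "Resource": f"arn:aws:s3:::{bucket}/{prefix}/*",
            },
        ],
    }


if __name__ == "__main__":
    # Hypothetical bucket and consumer prefix.
    policy = build_download_only_policy("example-mft-external", "downloads/consumer-a")
    print(json.dumps(policy, indent=2))
```

A policy like this could be attached to the IAM role used by the consumer's Transfer Family user, or supplied as a session policy, so that an external party cannot upload to or delete from the download folder.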

Conclusion

The new managed workflows feature within Transfer Family provides a simple mechanism to create file transfer flows. In this blog post, I showed you some of the common use cases you can implement using this new feature. You can combine this architecture approach with additional AWS services to build a robust and well-integrated cloud-native managed file transfer platform.

Shoeb Bustani

Shoeb Bustani is a Senior Consultant - Migrations in ProServe at Amazon Web Services, based in the United Kingdom. He has over 20 years of industry experience in enterprise and solution architecture across both traditional and cloud IT. As a senior migration consultant, he provides architecture leadership and helps customers accelerate their cloud adoption journey through a variety of migration strategies and using AWS Well-Architected cloud native solutions.