reference deployment

Matillion ETL on AWS

Automate data loads and transformations

This Quick Start deploys Matillion ETL for Amazon Redshift on the Amazon Web Services (AWS) Cloud in either a single instance with an Amazon Aurora database or a high availability (HA) cluster, following AWS best practices.

Matillion ETL for Amazon Redshift is a tool for extract, transform, and load (ETL) and extract, load, and transform (ELT) workflows. It automates data loads and transformations for greater speed, scale, and savings in the enterprise.

You can load data into Amazon Redshift from data sources such as cloud and on-premises databases, cloud and software as a service (SaaS) applications, application programming interfaces (APIs), files, and NoSQL databases. After your data is available in Amazon Redshift, you can combine components in Matillion ETL to build complex data transformations for visualizations, business intelligence, reporting, and advanced analytics.  

This Quick Start was developed by Matillion in collaboration with AWS. Matillion is an APN Partner.

  •  What you'll build
  • Single-instance architecture
    [Figure: Matillion ETL architecture on the AWS Cloud - single instance]

    Use this Quick Start to automatically set up the following environment on AWS:

    • A highly available architecture that spans two Availability Zones.*
    • A virtual private cloud (VPC) configured with public and private subnets according to AWS best practices, to provide you with your own virtual network on AWS.*
    • In the public subnet, a single Amazon Elastic Compute Cloud (Amazon EC2) instance running Matillion ETL.
    • An AWS Identity and Access Management (IAM) role attached to the EC2 instance to specify which AWS services the Matillion ETL instance can access.
    • In the private subnets, Amazon Aurora to use as the Matillion ETL metadata repository.
    • In a private subnet, an Amazon Redshift database, into which Matillion ETL for Amazon Redshift loads data.*
    • Amazon CloudWatch–based logging to monitor the status of the Matillion ETL server.
    • Amazon Simple Notification Service (Amazon SNS) to send Amazon CloudWatch alarm and event notifications.
    • AWS Key Management Service (AWS KMS) for encrypting the Amazon Redshift database at rest.
    Cluster architecture with high availability
    [Figure: Matillion ETL architecture on the AWS Cloud - high availability]

    Use this Quick Start to automatically set up the following environment on AWS:

    • A highly available architecture that spans two Availability Zones.*
    • A VPC configured with public and private subnets according to AWS best practices, to provide you with your own virtual network on AWS.*
    • In the public subnets, Amazon EC2 instances running Matillion ETL in a cluster, deployed across two Availability Zones.
    • An Application Load Balancer to direct traffic to the Matillion ETL instances.
    • An IAM role attached to the EC2 instances to specify which AWS services the Matillion ETL instances can access.
    • In the private subnets, Amazon Aurora, to use as the Matillion ETL metadata repository.
    • In a private subnet, an Amazon Redshift database, into which Matillion ETL for Amazon Redshift loads data.*
    • Amazon CloudWatch–based logging to monitor the status of the Matillion ETL server.
    • Amazon SNS to send Amazon CloudWatch alarm and event notifications.
    • AWS KMS for encrypting the Amazon Redshift database at rest.

    *  The template that deploys the Quick Start into an existing VPC skips the components marked by asterisks and prompts you for your existing VPC configuration.
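
    The asterisk rule above can be sketched as a small filter. This is only an illustration of the rule, not part of the Quick Start templates: the `Component` type, the shortened component names, and the `deployed_components` helper are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Component:
    name: str
    starred: bool  # True if the item carries an asterisk in the list above


# Shortened, illustrative names for the single-instance list above.
SINGLE_INSTANCE = [
    Component("Highly available architecture across two Availability Zones", True),
    Component("VPC with public and private subnets", True),
    Component("EC2 instance running Matillion ETL", False),
    Component("IAM role for the EC2 instance", False),
    Component("Amazon Aurora metadata repository", False),
    Component("Amazon Redshift database", True),
    Component("CloudWatch logging", False),
    Component("SNS notifications", False),
    Component("KMS encryption for Redshift", False),
]


def deployed_components(components, existing_vpc):
    """Return the names of the components a template would create.

    When deploying into an existing VPC, starred components are skipped.
    """
    if existing_vpc:
        return [c.name for c in components if not c.starred]
    return [c.name for c in components]


if __name__ == "__main__":
    for name in deployed_components(SINGLE_INSTANCE, existing_vpc=True):
        print(name)
```

    With `existing_vpc=True`, the VPC, the two-zone layout, and the Amazon Redshift database are left out, so you would supply those yourself when prompted.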

     See the source code for this Quick Start
  •  How to deploy
  • To deploy the Matillion ETL environment in your AWS account, follow the instructions in the deployment guide. The deployment process takes about 20 minutes and includes these steps:

    1. Sign in to your AWS account. If you don't already have one, sign up at https://aws.amazon.com.
    2. Subscribe to the Matillion ETL for Amazon Redshift Amazon Machine Image (AMI) in AWS Marketplace.
    3. Launch the Quick Start. You can deploy into a new VPC or into your existing VPC, as either a single instance or an HA cluster.
    4. Test the deployment.
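
    The launch step can also be scripted instead of run from the console. The following is a minimal sketch, assuming the third-party boto3 library is installed and AWS credentials are configured. The stack name, template URL, and parameter names are placeholders, not values from this Quick Start; take the real ones from the deployment guide. The capabilities a template needs may also differ from the one shown.

```python
def build_stack_request(stack_name, template_url, parameters):
    """Build the keyword arguments for CloudFormation's create_stack call.

    `parameters` is a plain dict; CloudFormation expects a list of
    ParameterKey/ParameterValue pairs with string values.
    """
    return {
        "StackName": stack_name,
        "TemplateURL": template_url,
        "Parameters": [
            {"ParameterKey": key, "ParameterValue": str(value)}
            for key, value in sorted(parameters.items())
        ],
        # Assumption: the template creates IAM resources (the IAM role above),
        # so the caller must acknowledge that explicitly.
        "Capabilities": ["CAPABILITY_IAM"],
    }


def launch(request, region_name):
    """Create the stack and block until creation completes."""
    import boto3  # third-party; imported here so the builder works without it

    client = boto3.client("cloudformation", region_name=region_name)
    response = client.create_stack(**request)
    waiter = client.get_waiter("stack_create_complete")
    waiter.wait(StackName=request["StackName"])
    return response["StackId"]


if __name__ == "__main__":
    # Hypothetical values for illustration only.
    request = build_stack_request(
        stack_name="matillion-etl",
        template_url="https://example.com/placeholder-template.yaml",
        parameters={"KeyPairName": "my-key-pair"},
    )
    print(request)  # call launch(request, "us-east-1") to actually deploy
```

    Separating the request builder from the launch call lets you inspect the parameters before anything is created in your account.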

    Please know that we may share who uses AWS Quick Starts with the AWS partner that collaborated with AWS on the content of the Quick Start.

  •  Cost and licenses
  • You are responsible for the cost of the AWS services used while running this Quick Start reference deployment. There is no additional cost for using the Quick Start.

    The AWS CloudFormation template for this Quick Start includes configuration parameters that you can customize. Some of these settings, such as instance type, will affect the cost of deployment. For cost estimates, see the pricing pages for each AWS service you will be using. Prices are subject to change.

    Tip: After you deploy the Quick Start, we recommend that you enable the AWS Cost and Usage Report to track costs associated with the Quick Start. This report delivers billing metrics to an Amazon Simple Storage Service (Amazon S3) bucket in your account. It provides cost estimates based on usage throughout each month, and finalizes the data at the end of the month. For more information about the report, see the AWS documentation.
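
    Once a report has been delivered, a short script can total the cost per service. The sketch below assumes you have already downloaded and decompressed a report file in CSV format, and that it contains the `lineItem/ProductCode` and `lineItem/UnblendedCost` columns; check the column names against your own report. The sample rows are made up for illustration and are not real billing data.

```python
import csv
import io
from collections import defaultdict
from decimal import Decimal


def cost_by_product(report_csv):
    """Sum the unblended cost per product code from report CSV text."""
    totals = defaultdict(Decimal)  # Decimal() starts each total at 0
    for row in csv.DictReader(io.StringIO(report_csv)):
        totals[row["lineItem/ProductCode"]] += Decimal(row["lineItem/UnblendedCost"])
    return dict(totals)


# Made-up sample rows, for illustration only.
SAMPLE_REPORT = """lineItem/ProductCode,lineItem/UnblendedCost
AmazonEC2,0.10
AmazonRedshift,0.25
AmazonEC2,0.05
"""

if __name__ == "__main__":
    for product, total in sorted(cost_by_product(SAMPLE_REPORT).items()):
        print(product, total)
```

    `Decimal` is used instead of `float` so that small per-line charges add up without rounding drift.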

    The Quick Start requires a subscription to the AMI for Matillion ETL for Amazon Redshift, which is available from AWS Marketplace. Additional pricing, terms, and conditions may apply.