Posted On: Jan 25, 2018

This Quick Start reference deployment automatically deploys the Informatica Data Lake Management solution on the Amazon Web Services (AWS) Cloud. 

A data lake is a single, Hadoop-based data repository that you create to manage data supply and demand. Informatica’s solution integrates, organizes, administers, governs, and secures large volumes of structured and unstructured data, so that it can deliver actionable, reliable information for business insights.

Deploying this Quick Start builds the Informatica Data Lake environment and embeds Hadoop clusters in the virtual private cloud (VPC) for metadata storage and processing. 

The deployment includes an Amazon EMR cluster for Hadoop Distributed File System (HDFS) and Hive, and sets up Amazon Simple Storage Service (Amazon S3) and Amazon Redshift environments for data scanning. The Informatica domain and repository database are hosted on Amazon Relational Database Service (Amazon RDS) using Oracle. 

AWS CloudFormation templates automate the deployment and provide customization options for network resources, Informatica settings, and AWS services.  
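
As a rough illustration of how such customization options are passed to a template, here is a minimal Python sketch that assembles the keyword arguments for the `create_stack` call of the boto3 CloudFormation client. The stack name, template URL, and parameter names (`KeyPairName`, `AvailabilityZones`) are hypothetical placeholders, not values taken from this Quick Start, and the `CAPABILITY_IAM` setting is an assumption; consult the Quick Start's deployment guide for the real template location and parameters.

```python
from __future__ import annotations

from typing import Any


def build_create_stack_request(
    stack_name: str,
    template_url: str,
    overrides: dict[str, str],
) -> dict[str, Any]:
    """Build keyword arguments for boto3's CloudFormation create_stack call."""
    # CloudFormation expects parameters as a list of key/value records.
    parameters = [
        {"ParameterKey": key, "ParameterValue": value}
        for key, value in sorted(overrides.items())
    ]
    return {
        "StackName": stack_name,
        "TemplateURL": template_url,
        "Parameters": parameters,
        # Assumption: the template creates IAM resources and needs this acknowledgement.
        "Capabilities": ["CAPABILITY_IAM"],
    }


if __name__ == "__main__":
    request = build_create_stack_request(
        stack_name="informatica-data-lake-demo",  # hypothetical
        template_url="https://example.com/placeholder-template.yaml",  # hypothetical
        overrides={  # hypothetical parameter names
            "KeyPairName": "my-key",
            "AvailabilityZones": "us-east-1a,us-east-1b",
        },
    )
    print(request)
    # With AWS credentials configured, the request could be submitted with:
    #   import boto3
    #   boto3.client("cloudformation").create_stack(**request)
```

Building the request separately from the API call keeps the sketch runnable without AWS credentials and makes the chosen parameter overrides easy to review before anything is deployed.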

About Quick Starts
Quick Starts are automated reference deployments that use AWS CloudFormation templates to deploy key technologies on AWS, following AWS best practices. This Quick Start is the latest in a set of AWS customer-ready solutions: ready-to-deploy reference architectures and best practices that address specific use cases or business processes.