reference deployment

Informatica Enterprise Data Catalog on AWS

Enterprise-wide data asset discovery, analytics, and governance on the AWS Cloud

This Quick Start deploys Enterprise Data Catalog from Informatica on the AWS Cloud. Enterprise Data Catalog helps you discover and catalog assets along with their relationships from data sources across your enterprise. Assets are data objects such as tables, columns, reports, views, and schemas, and might exist in relational databases, purpose-built applications, ETL or business intelligence tools, and big data systems.

Enterprise Data Catalog scans data sources to capture physical and operational metadata for your data assets (for example, column data statistics, data domains, data object relationships, and data lineage information). This metadata can help you make critical decisions on data integration, data quality, and data governance in the enterprise.

This Quick Start was developed by Informatica in collaboration with AWS. Informatica is an APN Partner.

  •  What you'll build
    Use this Quick Start to set up the following environment on AWS:

    • A virtual private cloud (VPC) configured across two Availability Zones with public and private subnets, to provide the network infrastructure for your Enterprise Data Catalog deployment.*
    • An internet gateway to provide access to the internet, and managed network address translation (NAT) gateways configured with an Elastic IP address for outbound internet connectivity.*
    • An IAM role with fine-grained permissions for access to AWS services necessary for the deployment process, and appropriate security groups to restrict access to only necessary protocols and ports.
    • In the public subnets, EC2 instances for Enterprise Data Catalog, including a configurable single-node or multi-node embedded cluster, scanners for extracting metadata, and Informatica services for data integration, cataloging, profiling, and analysis.
    • In the private subnets, Informatica domain and repository databases hosted on Amazon RDS using Microsoft SQL Server. The domain database manages the service-oriented architecture (SOA) namespace, and the repository database holds all the metadata about objects.

    * The template that deploys the Quick Start into an existing VPC skips the tasks marked by asterisks and prompts you for your existing VPC configuration.
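
As a rough illustration of the network layout described above, the following Python sketch carves a VPC address range into one public and one private subnet per Availability Zone. The CIDR block `10.0.0.0/16`, the /20 subnet size, the zone labels, and the function name `plan_subnets` are assumptions made for illustration only; the Quick Start template defines its own defaults.

```python
import ipaddress

def plan_subnets(vpc_cidr, new_prefix=20, zones=("az-1", "az-2")):
    """Return one public and one private subnet for each Availability Zone."""
    vpc = ipaddress.ip_network(vpc_cidr)
    # subnets() yields equal-sized, non-overlapping blocks inside the VPC range.
    blocks = vpc.subnets(new_prefix=new_prefix)
    plan = {}
    for zone in zones:
        plan[zone] = {"public": next(blocks), "private": next(blocks)}
    return plan

# Assumed VPC range; replace with the range you actually use.
layout = plan_subnets("10.0.0.0/16")
for zone, subnets in layout.items():
    print(zone, "public:", subnets["public"], "private:", subnets["private"])
```

Planning the ranges this way before launch can help when you deploy into an existing VPC, because the template prompts you for your existing VPC configuration.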

  •  How to deploy
    Deploy Enterprise Data Catalog on AWS in a few simple steps:

    1. If you don't already have an AWS account, sign up at https://aws.amazon.com. If you're using your existing account, you might need to request service limit increases for the EC2 instance type, Elastic IP addresses, and other AWS resources you'll be using in this deployment.
    2. Place your Enterprise Data Catalog license key file in an S3 bucket. To sign up for a demo license, contact Informatica.
    3. Launch the Quick Start. The deployment takes approximately two to three hours. You can choose from two options: deploy into a new VPC, or deploy into an existing VPC.

    To customize your deployment, you can choose different instance types for your resources and configure the size of the Informatica embedded cluster. You can also choose to import sample catalog data to start using Enterprise Data Catalog on AWS.
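
The customization options above are supplied to the template as parameters. As a minimal sketch, the Python below converts a flat dictionary of settings into the `ParameterKey`/`ParameterValue` list shape that AWS CloudFormation accepts. The parameter names (`InstanceType`, `ClusterSize`, `ImportSampleData`), their values, and the helper `to_cfn_parameters` are hypothetical; check the template itself for the real parameter names and allowed values.

```python
import json

def to_cfn_parameters(settings):
    """Convert a flat settings dict into CloudFormation's parameter list shape."""
    params = []
    for key, value in settings.items():
        # CloudFormation parameter values are passed as strings.
        # Rendering booleans as "true"/"false" is an assumed convention.
        if isinstance(value, bool):
            value = "true" if value else "false"
        params.append({"ParameterKey": key, "ParameterValue": str(value)})
    return params

# Hypothetical parameter names and values, for illustration only.
settings = {
    "InstanceType": "m5.2xlarge",
    "ClusterSize": 3,
    "ImportSampleData": True,
}
parameters = to_cfn_parameters(settings)
print(json.dumps(parameters, indent=2))
```

Keeping the settings in one dictionary makes it easier to review choices such as instance type and cluster size before you launch the stack.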

  •  Cost and licenses
    You are responsible for the cost of the AWS services used while running this Quick Start reference deployment. There is no additional cost for using the Quick Start.

    The AWS CloudFormation template for this Quick Start includes configuration parameters that you can customize. Some of these settings, such as instance type, will affect the cost of deployment. See the pricing pages for each AWS service you will be using for cost estimates.
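
To see how settings such as instance type and node count feed into cost, the sketch below multiplies an hourly rate by the resource count and the hours in a month. The rates, resource names, and the 730-hour month are placeholders, not real AWS prices; substitute figures from the pricing pages for the services you use.

```python
def estimate_monthly_cost(resources, hours_per_month=730):
    """Sum hourly rate x count x hours over all resources."""
    return sum(
        r["hourly_rate"] * r["count"] * hours_per_month for r in resources
    )

# Placeholder rates in USD per hour; these are NOT real AWS prices.
resources = [
    {"name": "catalog-node", "hourly_rate": 0.50, "count": 3},
    {"name": "database", "hourly_rate": 1.00, "count": 1},
]
total = estimate_monthly_cost(resources)
print(f"Estimated monthly cost: ${total:,.2f}")
```

This covers only resources billed by the hour; storage, data transfer, and the Informatica license are not included.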

    This Quick Start requires a license for Informatica Enterprise Data Catalog. To sign up for a demo license, please contact Informatica.