reference deployment

ClickHouse Cluster on AWS

An open-source, column-oriented database management system

This solution deploys a ClickHouse cluster on the Amazon Web Services (AWS) Cloud. ClickHouse is an open-source, column-oriented database management system (DBMS) used for online analytical processing (OLAP) queries.

This deployment is for customers who want to process analytical queries using a DBMS, such as MySQL, PostgreSQL, and Oracle Database. During the deployment, customers can configure the AWS CloudFormation templates to define the desired cluster nodes and settings.

This solution was developed by AWS.

  •  What you'll build
  • This solution sets up the following:

    • A highly available architecture that spans two Availability Zones.*
    • A virtual private cloud (VPC) configured with public and private subnets, according to AWS best practices, to provide you with your own virtual network on AWS.*
    • An internet gateway to allow internet access for bastion hosts.*
    • In the public subnets:
      • Managed network address translation (NAT) gateways to allow outbound internet access for resources in the private subnets.*
      • A Linux bastion host in an Auto Scaling group to allow inbound Secure Shell (SSH) access to Amazon Elastic Compute Cloud (Amazon EC2) instances in public and private subnets.*
    • In the private subnets:
      • A ClickHouse client in an Auto Scaling group to allow administrators to connect to the ClickHouse cluster.
      • A ClickHouse database cluster that contains Amazon EC2 instances.
      • A ZooKeeper cluster that contains Amazon EC2 instances for storing metadata for ClickHouse replication. Each replica stores its state in ZooKeeper as a set of parts and their checksums.
    • Elastic Load Balancing for the ClickHouse cluster.
    • An Amazon Simple Storage Service (Amazon S3) bucket for tiered storage of the ClickHouse cluster.
    • Amazon CloudWatch Logs to centralize ClickHouse logs and modify the log-retention policy.
    • Amazon Simple Notification Service (Amazon SNS) for sending email notifications when an alarm triggers.
    • AWS Secrets Manager to store dynamically generated passwords.

    *  The template that deploys the solution into an existing VPC skips the components marked by asterisks and prompts you for your existing VPC configuration.
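
To illustrate the note above that each replica records its state in ZooKeeper as a set of parts and checksums, here is a minimal toy sketch of how such a record could be compared against a replica's local parts. It is not ClickHouse's implementation: the function names, the part names, and the use of SHA-256 are all assumptions made for illustration.

```python
import hashlib


def part_checksum(data: bytes) -> str:
    """Toy checksum for a data part (an assumption; ClickHouse's real format differs)."""
    return hashlib.sha256(data).hexdigest()


def parts_to_fetch(registered: dict[str, str], local: dict[str, str]) -> list[str]:
    """Return parts that are missing locally or whose checksum differs from the registered one."""
    return sorted(
        name
        for name, checksum in registered.items()
        if local.get(name) != checksum
    )


# Hypothetical part names and contents, for illustration only.
registered = {
    "all_0_0_0": part_checksum(b"rows 1-100"),
    "all_1_1_0": part_checksum(b"rows 101-200"),
}
local = {"all_0_0_0": part_checksum(b"rows 1-100")}

print(parts_to_fetch(registered, local))  # ['all_1_1_0']
```

In this toy model, a replica that is missing a part, or holds a part whose checksum does not match the shared record, would know which parts to fetch from another replica.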

  •  How to deploy
  • To deploy a ClickHouse cluster on AWS, follow the instructions in the deployment guide. The deployment process takes about 60 minutes and includes these steps:

    1. Sign in to your AWS account. If you don't have an account, sign up at https://aws.amazon.com.
    2. Launch the solution. Before you create the stack, choose the AWS Region from the top toolbar. You can choose from two options: deploy into a new VPC or deploy into an existing VPC.
    3. Test your deployment.
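
The launch step can also be scripted instead of using the console. The sketch below only builds the request in the shape that the AWS SDK for Python (boto3) `create_stack` call accepts. The stack name, template URL, parameter keys, and the `CAPABILITY_IAM` setting are hypothetical placeholders; take the real values from the deployment guide and the template itself.

```python
def build_stack_request(stack_name, template_url, settings):
    """Build keyword arguments in the shape boto3's create_stack accepts."""
    return {
        "StackName": stack_name,
        "TemplateURL": template_url,
        # CloudFormation parameter values are passed as strings.
        "Parameters": [
            {"ParameterKey": key, "ParameterValue": str(value)}
            for key, value in settings.items()
        ],
        # Assumption: the template creates IAM resources and needs this acknowledgement.
        "Capabilities": ["CAPABILITY_IAM"],
    }


request = build_stack_request(
    stack_name="clickhouse-demo",  # hypothetical name
    template_url="https://example.com/clickhouse-template.yaml",  # placeholder URL
    settings={"ClickHouseNodeCount": 2, "ZookeeperNodeCount": 3},  # hypothetical keys
)
print(request["Parameters"])

# With boto3 installed and AWS credentials configured, the request could be sent as:
#   import boto3
#   boto3.client("cloudformation", region_name="us-east-1").create_stack(**request)
```

Keeping the request construction separate from the API call makes it possible to review the chosen parameters before any resources are created.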

    Amazon may share user-deployment information with the AWS Partner that collaborated with AWS on this solution.  

  •  Costs and licenses
  • You are responsible for the cost of the AWS services and any third-party licenses used while running this solution reference deployment. There is no additional cost for using the solution.

    The AWS CloudFormation templates for solutions include configuration parameters that you can customize. Some of the settings, such as the instance type, affect the cost of deployment. For cost estimates, refer to the pricing pages for each AWS service you use. Prices are subject to change.

    Tip: After you deploy a solution, create AWS Cost and Usage Reports to track associated costs. These reports deliver billing metrics to an Amazon Simple Storage Service (Amazon S3) bucket in your account. They provide cost estimates based on usage throughout each month and aggregate the data at the end of the month. For more information, refer to What are AWS Cost and Usage Reports?