Scalable Analytics Using Apache Druid on AWS is an AWS Solution that allows you to quickly and efficiently set up, operate, and manage Apache Druid in a cost-effective, highly available, resilient, and fault-tolerant hosting environment on AWS. With this solution, you can use the full suite of Apache Druid features and capabilities while taking advantage of the elasticity, scalability, and flexible pricing of AWS compute and storage offerings.
Benefits
Easy deployment of Druid clusters to AWS accounts in minutes
Gain the flexibility to customize installations using your choice of AWS compute engine and storage from a variety of instance and serverless options.
Specify an identity provider to authenticate users through the OpenID Connect protocol, use the solution’s out-of-the-box support for Lightweight Directory Access Protocol (LDAP), or configure basic authentication settings such as username and password.
Built-in logging and monitoring with Amazon CloudWatch
Send the log entries emitted by Druid to a centralized Amazon CloudWatch log group to facilitate debugging and troubleshooting, set up a monitoring dashboard to track the health of the Druid cluster, and configure alarms based on customer preferences.
Easy integration with Druid extensions
Install and configure this solution with native support for loading Druid extensions, including core and community extensions.
Step 1 AWS WAF protects the Druid web console and Druid API endpoints against common web exploits and bots that may affect availability, compromise security, or consume excessive resources. AWS WAF is provisioned and deployed only for internet-facing clusters.
Step 2 A security-hardened Linux server (bastion host) manages access to the Druid servers, which run in a private network separate from the external network. Where a private Application Load Balancer (ALB) is deployed, the bastion host can also be used to access the Druid web console through SSH tunneling.
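The SSH tunneling mentioned in Step 2 can be sketched as a local port forward through the bastion host. This is a minimal illustration, not the solution's own tooling: the host names, the `ec2-user` login, the local port 8888, and the ALB port 443 are all hypothetical placeholders to be replaced with the values from your deployment.

```python
import shlex


def build_tunnel_command(bastion_host, alb_dns_name, local_port=8888,
                         alb_port=443, user="ec2-user", key_path=None):
    """Return the argv list for an SSH local port forward via the bastion.

    All default values are assumptions made for illustration only.
    """
    # -N: do not run a remote command, only forward ports.
    # -L: forward local_port on this machine to alb_dns_name:alb_port,
    #     with the connection made from the bastion host.
    argv = ["ssh", "-N", "-L", f"{local_port}:{alb_dns_name}:{alb_port}"]
    if key_path:
        argv += ["-i", key_path]
    argv.append(f"{user}@{bastion_host}")
    return argv


if __name__ == "__main__":
    # Hypothetical host names; substitute the ones from your own stack.
    cmd = build_tunnel_command("bastion.example.com",
                               "internal-druid-alb.example.internal")
    print(shlex.join(cmd))
```

With the tunnel open, the web console would be reachable in a browser on the chosen local port.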
Step 3 An ALB serves as the single point of contact for clients. The load balancer distributes incoming application traffic across multiple query servers in multiple Availability Zones.
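Because the ALB is the single point of contact, a client would typically send its queries to the load balancer's address rather than to an individual query server. The sketch below builds (but does not send) a request for Druid's SQL endpoint, `/druid/v2/sql`. The base URL and the datasource name are hypothetical, and authentication is omitted because it depends on how the cluster is configured.

```python
import json
import urllib.request


def build_sql_request(base_url, sql):
    """Build a POST request for the Druid SQL API behind the load balancer."""
    body = json.dumps({"query": sql}).encode("utf-8")
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/druid/v2/sql",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


if __name__ == "__main__":
    # Hypothetical ALB address and datasource name.
    req = build_sql_request("https://druid.example.com",
                            "SELECT COUNT(*) FROM my_datasource")
    print(req.get_method(), req.full_url)
    # To send it: urllib.request.urlopen(req), with credentials added
    # as required by the authentication method configured for the cluster.
```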
Step 4 The private subnet consists of the following:
Druid master Auto Scaling group: An Auto Scaling group that contains a collection of Druid master servers. A master server manages data ingestion and availability: it is responsible for starting new ingestion jobs and coordinating the availability of data on the data servers. Within a master server, functionality is split between two processes: the Coordinator and the Overlord.
Druid data Auto Scaling group: An Auto Scaling group that contains a collection of Druid data servers. A data server runs ingestion jobs and stores queryable data. Within a data server, functionality is split between two processes: the Historical and the MiddleManager.
Druid query Auto Scaling group: An Auto Scaling group that contains a collection of Druid query servers. A query server provides the endpoints that users and client applications interact with, routing queries to data servers or other query servers. Within a query server, functionality is split between two processes: the Broker and the Router.
ZooKeeper Auto Scaling group: An Auto Scaling group that contains a collection of ZooKeeper servers. Apache Druid uses Apache ZooKeeper (ZK) to manage the current cluster state.
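The four groups in the private subnet and the processes each one hosts can be summarized as a small data structure. This is only an illustrative model of the description above; the class and function names are hypothetical and are not part of the solution.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ServerGroup:
    name: str          # short label for the Auto Scaling group
    role: str          # what the servers in the group do
    processes: tuple   # processes that run on each server


PRIVATE_SUBNET_GROUPS = (
    ServerGroup("master", "manages ingestion and data availability",
                ("Coordinator", "Overlord")),
    ServerGroup("data", "runs ingestion jobs and stores queryable data",
                ("Historical", "MiddleManager")),
    ServerGroup("query", "serves client endpoints and routes queries",
                ("Broker", "Router")),
    ServerGroup("zookeeper", "manages current cluster state",
                ("ZooKeeper",)),
)


def group_for_process(process):
    """Return the name of the group that hosts the given process."""
    for group in PRIVATE_SUBNET_GROUPS:
        if process in group.processes:
            return group.name
    raise KeyError(process)


if __name__ == "__main__":
    for group in PRIVATE_SUBNET_GROUPS:
        print(f"{group.name}: {', '.join(group.processes)}")
```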
Step 5 An Amazon Simple Storage Service (S3) bucket provides deep storage for the Apache Druid cluster. Deep storage is the location where the segments are stored.
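In a self-managed Druid cluster, S3 deep storage is usually enabled through the `druid-s3-extensions` extension and the `druid.storage.*` runtime properties. The sketch below renders those properties as text; the bucket name and key prefix are hypothetical, and the solution may generate its own configuration differently.

```python
import json


def deep_storage_properties(bucket, base_key="druid/segments"):
    """Render Druid runtime properties for S3 deep storage.

    The bucket and base_key values are placeholders for illustration.
    """
    props = {
        "druid.extensions.loadList": json.dumps(["druid-s3-extensions"]),
        "druid.storage.type": "s3",
        "druid.storage.bucket": bucket,
        "druid.storage.baseKey": base_key,
    }
    return "\n".join(f"{key}={value}" for key, value in props.items())


if __name__ == "__main__":
    print(deep_storage_properties("my-druid-deep-storage"))
```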
Step 6 AWS Secrets Manager stores the secrets used by Apache Druid, including the Amazon Relational Database Service (Amazon RDS) secret and the administrator user secret. It also stores the credentials for the system account that the Druid components use to authenticate with each other.
Step 8 An Amazon Aurora PostgreSQL database provides the metadata storage for the Apache Druid cluster. Druid uses the metadata store to house only metadata about the system and does not store the actual data.
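To show how the database secret and the metadata store relate, the sketch below turns a secret's JSON string into Druid's PostgreSQL metadata storage properties. It assumes the secret uses the field names `host`, `port`, `dbname`, `username`, and `password`; check the actual secret in your deployment, since the solution's layout is not described here. Retrieving the secret from Secrets Manager is left out so that the example stays self-contained.

```python
import json


def metadata_storage_properties(secret_string):
    """Map a database secret (JSON text) to Druid metadata storage properties.

    The secret field names used here are an assumption for illustration.
    """
    secret = json.loads(secret_string)
    uri = (f"jdbc:postgresql://{secret['host']}:{secret['port']}"
           f"/{secret['dbname']}")
    return {
        "druid.metadata.storage.type": "postgresql",
        "druid.metadata.storage.connector.connectURI": uri,
        "druid.metadata.storage.connector.user": secret["username"],
        "druid.metadata.storage.connector.password": secret["password"],
    }


if __name__ == "__main__":
    # Hypothetical secret contents; never hard-code real credentials.
    sample = json.dumps({
        "host": "druid-metadata.cluster-example.us-east-1.rds.amazonaws.com",
        "port": 5432,
        "dbname": "druid",
        "username": "druid",
        "password": "example-only",
    })
    for key, value in metadata_storage_properties(sample).items():
        shown = "****" if key.endswith("password") else value
        print(f"{key}={shown}")
```

Only metadata such as segment records and task state would live in this database; the segments themselves stay in deep storage.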
Step 9 The notification system, powered by Amazon Simple Notification Service (Amazon SNS), delivers alerts or alarms promptly when system events occur. This ensures immediate awareness and action when needed.