Overview
The AWS Glue Connector for Elasticsearch helps you read from and write to Elasticsearch using Apache Spark. With this connector, you can focus on extracting business insights from your data instead of writing and maintaining connection logic. For more details, see the AWS Glue guidance: https://docs.aws.amazon.com/glue/latest/ug/tutorial-elastisearch-connector.html. For more about the underlying open-source Elasticsearch Spark connector, see its online reference: https://www.elastic.co/guide/en/elasticsearch/hadoop/current/spark.html.
Highlights
- Connect to Elasticsearch from AWS Glue jobs
- Simplify data extracts from Elasticsearch
- Simplify data loads to Elasticsearch
Pricing
Vendor refund policy
We do not currently support refunds, but you can cancel at any time.
Delivery details
Glue 3.0
- Amazon ECS
- Amazon EKS
Container image
Containers are lightweight, portable execution environments that wrap server application software in a filesystem that includes everything it needs to run. Container applications run on supported container runtimes and orchestration services, such as Amazon Elastic Container Service (Amazon ECS) or Amazon Elastic Kubernetes Service (Amazon EKS). Both eliminate the need for you to install and operate your own container orchestration software by managing and scheduling containers on a scalable cluster of virtual machines.
Version release notes
Elasticsearch Connector for AWS Glue 7.13.4.
- This version is built with elasticsearch-spark 7.13.4.
- This version is compatible with AWS Glue 3.0, 2.0, and 1.0.
- This version supports both read from and write to Elasticsearch.
Additional details
Usage instructions
Subscribe to the product from AWS Marketplace, then activate the Glue connector from AWS Glue Studio.
Prerequisites
- Elasticsearch domain
- For Amazon OpenSearch Service, enable compatibility mode.
- Network reachability from the Glue job to the Elasticsearch domain
- For Amazon Elasticsearch Service users, you need to allow access from Glue jobs in the policy.
- If your Elasticsearch domain uses VPC access, you need to attach a Glue Network connection with the VPC configuration to the Glue job.
- If your Elasticsearch domain uses public access and you want to allow access only from Glue jobs, you can add a NAT gateway to your subnet and attach a Glue Network connection that uses this subnet to the Glue job.
- AWS Secrets Manager secret (you can create the secret in the following steps)
Create a new secret for Elasticsearch in AWS Secrets Manager
Create a secret in AWS Secrets Manager to store your Elasticsearch username and password.
- On the Secrets Manager console, choose Store a new secret.
- For Secret type, select Other type of secret.
- Enter your key as es.net.http.auth.user and the value as your Elasticsearch username.
- Enter your key as es.net.http.auth.pass and the value as your Elasticsearch password.
- Leave the rest of the options at their default.
- Choose Next.
- Name the secret elasticsearch_credentials.
- Follow through the rest of the steps to store the secret.
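The console steps above can also be scripted. The following is a minimal sketch, not the documented procedure: it builds the same two key/value pairs as a JSON secret string and, optionally, stores them with boto3. The helper names build_secret_string and store_secret are hypothetical, and the sketch assumes boto3 is installed and that AWS credentials and a region are already configured.

```python
import json


def build_secret_string(username: str, password: str) -> str:
    """Return a JSON payload holding the two keys described in the steps above."""
    return json.dumps(
        {
            "es.net.http.auth.user": username,
            "es.net.http.auth.pass": password,
        }
    )


def store_secret(secret_string: str, name: str = "elasticsearch_credentials") -> str:
    """Create the secret in AWS Secrets Manager and return its ARN.

    Assumption: boto3 is available and AWS credentials/region are configured.
    """
    import boto3  # imported lazily so build_secret_string works without boto3

    client = boto3.client("secretsmanager")
    response = client.create_secret(Name=name, SecretString=secret_string)
    return response["ARN"]
```

Calling store_secret(build_secret_string("<username>", "<password>")) with your own credentials would create a secret with the same name used in the steps above.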
Connection options
You can pass the following options to the connector.
- path (required): The Elasticsearch path in the format index/type.
- es.nodes (required): The Elasticsearch HTTPS endpoint.
- es.port (required): The Elasticsearch port number. For Amazon Elasticsearch Service, choose 443.
- es.nodes.wan.only (optional): Whether the connector is used against an Elasticsearch instance in a cloud/restricted environment over the WAN. For Amazon Elasticsearch Service, choose true.
See other available options here: https://www.elastic.co/guide/en/elasticsearch/hadoop/current/configuration.html
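As an illustration of how the options above might be passed in a Glue PySpark job, the sketch below assembles them into a dictionary and hands it to the Glue context for a read and a write. The function names, the "connectionName" key, and the "marketplace.spark" connection type are assumptions not stated in this listing; check them against the AWS Glue guidance for your Glue version. The glue_context argument is expected to be an awsglue GlueContext created inside a Glue job.

```python
from typing import Dict, Optional


def build_connection_options(
    es_nodes: str,
    path: str,
    connection_name: str,
    es_port: str = "443",
    wan_only: bool = True,
    extra: Optional[Dict[str, str]] = None,
) -> Dict[str, str]:
    """Assemble the connector options listed above into one dictionary."""
    options = {
        "path": path,                    # index/type
        "es.nodes": es_nodes,            # Elasticsearch HTTPS endpoint
        "es.port": es_port,              # 443 for Amazon Elasticsearch Service
        "es.nodes.wan.only": "true" if wan_only else "false",
        # Assumption: name of the Glue connection activated in Glue Studio.
        "connectionName": connection_name,
    }
    options.update(extra or {})          # any other elasticsearch-hadoop options
    return options


def read_from_elasticsearch(glue_context, options: Dict[str, str]):
    """Read the index into a DynamicFrame (assumes a Marketplace Spark connector)."""
    return glue_context.create_dynamic_frame.from_options(
        connection_type="marketplace.spark",
        connection_options=options,
        transformation_ctx="elasticsearch_source",
    )


def write_to_elasticsearch(glue_context, dynamic_frame, options: Dict[str, str]):
    """Write a DynamicFrame to the index named by options["path"]."""
    return glue_context.write_dynamic_frame.from_options(
        frame=dynamic_frame,
        connection_type="marketplace.spark",
        connection_options=options,
        transformation_ctx="elasticsearch_sink",
    )
```

Keeping the option assembly separate from the Glue calls means the dictionary can be inspected or unit-tested outside a Glue job, where the awsglue library is not available.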
References
For more details, see the Glue Elasticsearch connector guidance: https://docs.aws.amazon.com/glue/latest/ug/tutorial-elastisearch-connector.html. For more about the underlying open-source Elasticsearch Spark connector, see its online reference: https://www.elastic.co/guide/en/elasticsearch/hadoop/current/spark.html.
Support
Vendor support
Please allow 24 hours for a response.
AWS infrastructure support
AWS Support is a one-on-one, fast-response support channel that is staffed 24x7x365 with experienced and technical support engineers. The service helps customers of all sizes and technical abilities to successfully utilize the products and features provided by Amazon Web Services.