The Google Cloud Storage Connector for AWS Glue simplifies the process of connecting AWS Glue jobs to extract data from Google Cloud Storage. This connector provides comprehensive access to Google Cloud Storage data, facilitating cloud ETL processes for operational reporting, data governance, and more.
Highlights
Connect to Google Cloud Storage from AWS Glue Jobs
Pricing is based on a fixed monthly subscription cost. You pay the same amount each month for unlimited usage of the product. Pricing is prorated, so you're only charged for the number of days you've been subscribed. Subscriptions have no end date and may be canceled any time.
Legal
Vendor terms and conditions
Upon subscribing to this product, you must acknowledge and agree to the terms and conditions outlined in the vendor's End User License Agreement (EULA).
Content disclaimer
Vendors are responsible for their product descriptions and other product content. AWS does not warrant that vendors' product descriptions or other product content are accurate, complete, reliable, current, or error-free.
Version release notes
AWS Glue Connector for Google Cloud Storage reads data from Google Cloud Storage.
What is the Google Cloud Storage connector for AWS Glue?
The Google Cloud Storage Connector for AWS Glue simplifies the process of connecting AWS Glue jobs to extract data from Google Cloud Storage. The connector provides comprehensive access to Google Cloud Storage data and supports cloud ETL processes for operational reporting, data governance, and more.
Use the Google Cloud Storage Glue connector
To use the Google Cloud Storage Glue connector in your Glue ETL job, first activate the connector, then set the connector options in the job. This section lists the options you can set on the connector and shows how to use it in a Glue ETL job.
Connector options you need to set
secret (required) - the name of the secret that contains the GCP service account credentials.
Database (required) - the database in the Glue catalog.
table (required) - the name of the Glue catalog table you want to pull data from (case sensitive).
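As a minimal sketch, the three required options can be collected and checked in Python before they are passed to a job. The helper name `build_connector_options` and the sample values are hypothetical. The option keys are copied as they appear in the list above, including the capitalization of `Database`, which should be confirmed against the connector itself.

```python
def build_connector_options(secret: str, database: str, table: str) -> dict:
    """Return the required connector options, rejecting empty values."""
    options = {
        "secret": secret,      # name of the secret holding the service account credentials
        "Database": database,  # Glue catalog database (key spelled as in the listing)
        "table": table,        # Glue catalog table name, case sensitive
    }
    missing = [key for key, value in options.items() if not value]
    if missing:
        raise ValueError("missing required connector options: " + ", ".join(missing))
    return options


# Hypothetical sample values, for illustration only.
options = build_connector_options(
    secret="my-gcs-secret",
    database="my_glue_database",
    table="my_gcs_table",
)
print(options)
```

Failing early on an empty value keeps a misconfigured job from starting and then failing inside the connector.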
The recommended approach to setting the connector options
Use AWS Secrets Manager to store the username, password, and other sensitive information related to the source connection.
The IAM role used by the Glue job should include the following policies:
AWS Glue Service - To run the job.
AWS Glue Catalog - To access the database and table in the Glue Catalog.
Amazon EC2 Container Registry - To access Amazon Elastic Container Registry (Amazon ECR).
Secrets Manager (optional) - If you use AWS Secrets Manager for connection options.
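One way to set up such a role is sketched below with boto3. The mapping from the list above to specific AWS managed policies is an assumption, not something this listing specifies: `AWSGlueServiceRole` is assumed to cover both the Glue service and Glue Catalog items, and `SecretsManagerReadWrite` is broader than most jobs need, so a narrower custom policy may be preferable. The function names and the role name are hypothetical.

```python
import json

# Trust policy that lets the AWS Glue service assume the role.
GLUE_TRUST_POLICY = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Principal": {"Service": "glue.amazonaws.com"},
            "Action": "sts:AssumeRole",
        }
    ],
}

# Assumed mapping of the listed requirements to AWS managed policies.
BASE_POLICY_ARNS = [
    "arn:aws:iam::aws:policy/service-role/AWSGlueServiceRole",
    "arn:aws:iam::aws:policy/AmazonEC2ContainerRegistryReadOnly",
]
SECRETS_POLICY_ARN = "arn:aws:iam::aws:policy/SecretsManagerReadWrite"


def policy_arns(include_secrets_manager: bool) -> list:
    """Return the managed policy ARNs to attach to the job role."""
    arns = list(BASE_POLICY_ARNS)
    if include_secrets_manager:
        arns.append(SECRETS_POLICY_ARN)
    return arns


def create_glue_role(role_name: str, include_secrets_manager: bool = True) -> None:
    """Create the role and attach the policies (requires boto3 and AWS credentials)."""
    import boto3  # imported here so the helpers above work without boto3 installed

    iam = boto3.client("iam")
    iam.create_role(
        RoleName=role_name,
        AssumeRolePolicyDocument=json.dumps(GLUE_TRUST_POLICY),
    )
    for arn in policy_arns(include_secrets_manager):
        iam.attach_role_policy(RoleName=role_name, PolicyArn=arn)
```

Calling `create_glue_role("my-glue-gcs-role")` with suitable credentials would create the role; nothing is created when the block is only imported.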
Using the Google Cloud Storage Connector for AWS Glue
Here are the setup steps for configuring the Google Cloud Storage Connector:
Step 1: Create a database and table in the AWS Glue Catalog
Create a database and table in the AWS Glue Catalog:
Create the table with a GCS location as its location.
e.g. gs://bucket/folder
Add the schema, with a partition column if applicable. (Note: partition columns should only be of type varchar or String.)
If the table is partitioned, edit the table and add partition.pattern as a table property.
e.g. partition.pattern : year=${year_column_name}/month=${month_column_name}/
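To illustrate how such a pattern relates to object paths, the sketch below fills in the placeholders with Python's `string.Template`. It assumes a Hive-style `key=value` folder layout and hypothetical partition columns named `year` and `month`; how the connector itself resolves the pattern is not described in this listing.

```python
from string import Template


def expand_partition_pattern(pattern: str, values: dict) -> str:
    """Substitute each ${column} placeholder with the given partition value."""
    return Template(pattern).substitute(values)


# Hypothetical table location and partition columns, for illustration only.
table_location = "gs://bucket/folder/"
pattern = "year=${year}/month=${month}/"

prefix = expand_partition_pattern(pattern, {"year": "2023", "month": "07"})
print(table_location + prefix)  # prints gs://bucket/folder/year=2023/month=07/
```

The partition values are strings here, in line with the note that partition columns should be of type varchar or String.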
Step 2: Set up the IAM role and the secret in AWS Secrets Manager
Create the role and secret:
IAM role with the policies described above.
Optional - Create a secret in AWS Secrets Manager for the connector options described above.
Step 3: Set up the Google Cloud Storage connector and a related connection in the Glue Studio console
To set up the Google Cloud Storage connector and create a connection for your job:
Subscribe to the product and activate the connector in AWS Glue Studio from the top of this instruction page.
Enter your connection name and choose "Create connection and activate connector".
Step 4: Create a job
To create a job from the connection created in the previous step:
Choose the connection and "create job".
Select the node for your connection on the visual canvas.
Add connection options and enter the necessary information.
Follow the earlier section (Connector options you need to set) to add the connection options.
Or create a secret in AWS Secrets Manager with the connector options and attach it to the connection, as described in Step 2.
Enter the job name, choose the IAM role created in Step 2, set the other properties in the "Job details" tab, and choose "Save".
Step 5: Save and run the job
After filling in all parameters and saving the connector job, run the job.
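For a job edited as a script rather than on the visual canvas, the read step might look like the sketch below. It assumes the connector is read through the `marketplace.spark` connection type with the connection name passed as `connectionName`; the connection name, option values, and the helper `read_from_gcs` are hypothetical, and a script generated by Glue Studio for this connector may differ.

```python
import sys


def read_from_gcs(glue_context, connection_name: str, options: dict):
    """Read a DynamicFrame through a Marketplace connector connection."""
    connection_options = dict(options, connectionName=connection_name)
    return glue_context.create_dynamic_frame.from_options(
        connection_type="marketplace.spark",
        connection_options=connection_options,
        transformation_ctx="gcs_source",
    )


def main():
    # These modules are only available inside the AWS Glue job runtime.
    from awsglue.context import GlueContext
    from awsglue.job import Job
    from awsglue.utils import getResolvedOptions
    from pyspark.context import SparkContext

    args = getResolvedOptions(sys.argv, ["JOB_NAME"])
    glue_context = GlueContext(SparkContext.getOrCreate())
    job = Job(glue_context)
    job.init(args["JOB_NAME"], args)

    # Hypothetical connection name and option values.
    frame = read_from_gcs(
        glue_context,
        connection_name="my-gcs-connection",
        options={
            "secret": "my-gcs-secret",
            "Database": "my_glue_database",
            "table": "my_gcs_table",
        },
    )
    frame.printSchema()
    job.commit()


# Glue passes --JOB_NAME to the script, so main() runs only inside a job.
if __name__ == "__main__" and "--JOB_NAME" in sys.argv:
    main()
```

Keeping the read in its own function lets the option handling be checked with a stand-in context outside Glue.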