Posted On: Jun 8, 2023

Amazon Athena for Apache Spark now allows you to use your own Java libraries and customize the Spark configuration for your Spark workloads. You can supply Java libraries as custom JARs with Athena Spark to analyze data from multiple sources, or use functions in custom JARs for more flexibility with calculations.

Amazon Athena for Apache Spark is a feature of Amazon Athena that lets you run interactive analytics on Apache Spark in under a second to analyze petabytes of data. You can now include your own Java libraries and modules (as JAR files) in Spark workloads to connect to different data sources and to run advanced calculations using user-defined functions to perform feature exploration. You can also set Spark configurations for your Athena sessions, for example to provide custom settings required by your Java packages or to access AWS Glue catalogs across accounts in support of data-mesh-like design patterns. This launch includes a set of reference connector packages for Amazon CloudWatch Logs, Amazon CloudWatch metrics, and Amazon DynamoDB so that you can use data from these services to derive insights.
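
As a rough illustration of what such a session configuration might look like, the Python sketch below assembles a set of Spark properties and prints them as JSON. It is a minimal sketch under stated assumptions, not the documented Athena procedure: `spark.jars` is a standard Apache Spark property (a comma-separated list of JARs), while the Glue catalog separator key, the helper name `build_spark_properties`, and the bucket path are assumptions chosen for illustration. Check the Athena documentation for the exact property names and for how properties are supplied to a session.

```python
import json


def build_spark_properties(jar_uris, glue_separator=None):
    """Return a dict of Spark properties for a session (illustrative helper)."""
    if not jar_uris:
        raise ValueError("at least one JAR URI is required")

    props = {
        # Standard Spark property: comma-separated list of JARs to load.
        "spark.jars": ",".join(jar_uris),
    }

    if glue_separator is not None:
        # Assumed property name for cross-account AWS Glue catalog access;
        # verify it against the Athena for Apache Spark documentation.
        props["spark.hadoop.aws.glue.catalog.separator"] = glue_separator

    return props


# Hypothetical S3 location of a custom JAR containing user-defined functions.
example = build_spark_properties(
    ["s3://example-bucket/libs/my-udfs.jar"],
    glue_separator="/",
)

print(json.dumps(example, indent=2))
```

Keeping the properties in a plain dictionary makes it straightforward to serialize them to whatever form the session setup expects.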

Support for custom Java libraries and custom Spark configurations is available in nine AWS Regions where Amazon Athena for Apache Spark is offered: US East (Ohio), US East (N. Virginia), US West (Oregon), Europe (Ireland), Europe (Frankfurt), Asia Pacific (Tokyo), Asia Pacific (Singapore), Asia Pacific (Sydney), and Asia Pacific (Mumbai). To learn more and get started, visit the Amazon Athena for Apache Spark documentation page.