Amazon Athena for Apache Spark

Start interactive analytics on Apache Spark in under a second

Why Athena on Apache Spark?

With Amazon Athena for Apache Spark, you can start interactive analytics in under a second and analyze petabytes of data. Interactive Spark applications start in under a second and run faster with our optimized Spark runtime, so you spend more time on insights, not waiting for results. Build Spark applications using the expressiveness of Python with a simplified notebook experience in the Athena console or through Athena APIs. With the serverless, fully managed Athena model, there are no resources to provision, manage, or configure, and no minimum fee or setup cost. You pay only for the queries that you run.
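As a rough illustration of the "through Athena APIs" path, the following minimal sketch starts a Spark session, submits one block of PySpark code, polls until the calculation finishes, and then ends the session. It assumes a client object that behaves like boto3.client("athena") and uses the operation names start_session, get_session_status, start_calculation_execution, get_calculation_execution, and terminate_session as we understand them from the AWS SDK for Python. The helper name run_spark_calculation, the default of 4 DPUs, and the polling and timeout values are illustrative assumptions, not recommendations from this page.

```python
import time

# Assumption: state names as we understand them from the Athena API reference.
SESSION_FAILED_STATES = {"FAILED", "TERMINATED", "DEGRADED"}
CALCULATION_TERMINAL_STATES = {"COMPLETED", "FAILED", "CANCELED"}


def _wait(deadline, poll_seconds):
    """Sleep between polls, raising TimeoutError once the deadline has passed."""
    if time.monotonic() >= deadline:
        raise TimeoutError("Timed out waiting for Athena for Apache Spark")
    time.sleep(poll_seconds)


def run_spark_calculation(client, workgroup, code, max_dpus=4,
                          poll_seconds=1.0, timeout_seconds=300.0):
    """Run one block of PySpark code in a new session (hypothetical helper).

    `client` is expected to behave like boto3.client("athena").
    Returns a (final_state, result) tuple for the calculation.
    """
    deadline = time.monotonic() + timeout_seconds

    # Start a session in a Spark-enabled workgroup.
    session_id = client.start_session(
        WorkGroup=workgroup,
        EngineConfiguration={"MaxConcurrentDpus": max_dpus},
    )["SessionId"]

    try:
        # Wait until the session is ready to accept calculations.
        while True:
            status = client.get_session_status(SessionId=session_id)
            state = status["Status"]["State"]
            if state == "IDLE":
                break
            if state in SESSION_FAILED_STATES:
                raise RuntimeError(f"Session ended in state {state}")
            _wait(deadline, poll_seconds)

        # Submit the code block as a single calculation.
        calculation_id = client.start_calculation_execution(
            SessionId=session_id,
            CodeBlock=code,
        )["CalculationExecutionId"]

        # Poll until the calculation reaches a terminal state.
        while True:
            response = client.get_calculation_execution(
                CalculationExecutionId=calculation_id
            )
            state = response["Status"]["State"]
            if state in CALCULATION_TERMINAL_STATES:
                return state, response.get("Result", {})
            _wait(deadline, poll_seconds)
    finally:
        # Always end the session so it does not keep running after the call.
        client.terminate_session(SessionId=session_id)
```

A call might look like run_spark_calculation(boto3.client("athena"), "my-spark-workgroup", "print(spark.range(10).count())"), where the workgroup name is a placeholder and the code block assumes that the session exposes a ready-made spark object. Passing the client in as a parameter keeps the sketch independent of credentials and makes it easy to exercise with a stub.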

Benefits

Spend more time on insights, not on waiting for results. Interactive Spark applications start in under a second and run faster with our optimized Spark runtime.
Use the expressiveness of Python with the popular open-source Spark framework to seek more complex insights from your data. Use notebooks to query data, chain calculations, and visualize results.
Run Spark applications cost-effectively, without provisioning or managing resources. Build Spark applications without worrying about Spark configurations or version upgrades.
Work with data in various data lakes, in open data formats, and with your business applications without moving the data. Use data discovered and categorized by AWS Glue to build your Spark insights.

Use cases

Use Athena and AWS Glue to explore datasets and work with data. 

View various datasets and data formats together to formulate insights.

Build SaaS applications that use Athena for Apache Spark to interactively work with data.

Explore data stores to generate sample datasets and perform interactive feature generation.