How can I schedule an Amazon Athena query?

Last updated: 2021-12-20

I want to schedule queries in Amazon Athena.

Short description

Scheduling queries is useful in many scenarios, such as running periodic reporting queries or loading new partitions on a regular interval. Here are some of the ways that you can schedule queries in Athena:

Resolution

To schedule an Athena query using a Lambda function and an EventBridge rule:

1.    Create an AWS Identity and Access Management (IAM) service role for Lambda. Then, attach a policy that allows access to Athena, Amazon Simple Storage Service (Amazon S3), and Amazon CloudWatch Logs. For example, you can add AmazonAthenaFullAccess and CloudWatchLogsFullAccess to the role. AmazonAthenaFullAccess allows full access to Athena and includes basic permissions for Amazon S3. CloudWatchLogsFullAccess allows full access to CloudWatch Logs.

2.    Open the Lambda console.

3.    Choose Create function.

4.    Be sure that Author from scratch is selected, and then configure the following options:

For Name, enter a name for your function.
For Runtime, choose one of the Python options.
For Role, choose Use an existing role, and then choose the IAM role that you created in step 1.

5.    Choose Create function.

6.    Paste your code in the Function code section. The following example uses Python 3.7. Replace the following values in the example:

default: the Athena database name
SELECT * FROM default.tb: the query that you want to schedule
s3://AWSDOC-EXAMPLE-BUCKET/: the S3 bucket for the query output

import time
import boto3

query = 'SELECT * FROM default.tb'
DATABASE = 'default'
output='s3://AWSDOC-EXAMPLE-BUCKET/'

def lambda_handler(event, context):
    query = "SELECT * FROM default.tb"
    client = boto3.client('athena')
    # Execution
    response = client.start_query_execution(
        QueryString=query,
        QueryExecutionContext={
            'Database': DATABASE
        },
        ResultConfiguration={
            'OutputLocation': output,
        }
    )
    return response
    return

7.    Choose Deploy.

8.    Open the Amazon EventBridge console.

9.    In the navigation pane, choose Rules, and then choose Create rule.

10.    Enter a name and description for the rule.

11.    For Define pattern, select Schedule.

12.    Select Cron expression, and then enter a cron expression.

13.    For Select event bus, select AWS default event bus.

14.    In the Select Targets section, do the following:

For Target, select Lambda function from the dropdown list. For Function, select the name of your Lambda function from the dropdown list.

15.    Choose Create.

If you're scheduling multiple queries, keep in mind that there are quotas for the number of calls to the Athena API per account. For more information, see Per account API call quotas.