How can I schedule queries for an Amazon Redshift cluster?

4 minute read
0

I am trying to schedule a query to run in an Amazon Redshift cluster. How can I do that?

Short description

You can use the query editor to run, view, save, and schedule SQL queries for your provisioned Amazon Redshift clusters. Scheduled queries are initiated by Amazon EventBridge.

Resolution

Note: If you receive errors when running AWS Command Line Interface (AWS CLI) commands, make sure that you’re using the most recent AWS CLI version.

Query editor permissions

To use the query editor, users must have a minimum set of permissions to Amazon Redshift. If your cluster uses other AWS services, users must also have permissions to those services. For example, AWS Secrets Manager secrets and AWS Key Management Service (AWS KMS) keys. For more information, see Permissions required to use the Amazon Redshift console query editor.

You can use AWS Identity and Access Management (IAM) managed policies to manage permissions based on IAM roles. For more information, see AWS managed (predefined) policies for Amazon Redshift.

To schedule queries, the IAM user and role associated with the query must have permission to manage the query editor. For instructions, see Setting up permissions to schedule a query on the Amazon Redshift console.

Scheduling queries using the AWS Management Console

To schedule the query using the AWS Management Console, follow these steps:

Note:

  • You must use the query editor v1 to schedule queries.
  • You can use this query editor only with provisioned clusters.

1.    Open the Amazon Redshift console.

2.    In the navigation pane, choose Query editor.

3.    Follow the instructions to connect to a database in your cluster.

4.    Follow the instructions to create a schedule to run a SQL statement.

5.    If you're using AWS Secrets Manager with Amazon Redshift, then choose your secret.

6.    If you're using temporary IAM user credentials, choose your cluster and provide the database and user names. Note: GetClusterCredentials permission must be present for the IAM user to use the temporary credentials.

7.    Choose Save.

Note: If you turned on Simple Notification Service (Amazon SNS), then make sure that the "sns:Publish" API action is present for "events.amazonaws.com". To confirm, add the following statement:

{
  "Sid": "Allow_Publish_Events",
  "Effect": "Allow",
  "Principal": {
    "Service": "events.amazonaws.com"
  },
  "Action": "sns:Publish",
  "Resource": "arn:aws:sns:us-east-1:{ACCOUNT_ID}:{SNS_TOPIC_NAME}"
}

Scheduling queries using a SQL statement with EventBridge

You can schedule SQL statements using EventBridge and the Amazon Redshift data API. In the following example, a schedule is set up to refresh current events every 5 minutes.

1.    Create an EventBridge event rule named "scheduled-refresh-currentEvents" and schedule it to run every 5 minutes using cron or rate expressions similar to the following:

aws events put-rule \
--name scheduled-refresh-currentEvents \
--schedule-expression "cron(0/5 * * * ? *)"

2.    Create and save a JSON file named refreshCurrentEvents.json using your favorite text editor that contains the cluster, role, and the AWS Secrets Manager ARN:

{
  "Rule": "scheduled-refresh-currentEvents",
  "EventBusName": "default",
  "Targets": [
    {
      "Id": "scheduled-refresh-currentEvents",
      "Arn": "<Clsuter ARN>",
      "RoleArn": "<Role ARN>",
      "RedshiftDataParameters": {
        "SecretManagerArn": "<Secrets Manager ARN>",
        "Database": "<Database Name>",
        "Sql": "REFRESH MATERIALIZED VIEW currentEvents;",
        "StatementName": "scheduled-refresh-currentEvents",
        "WithEvent": true
      }
    }
  ]
}

3.    Create an event target using the saved JSON file:

aws events put-targets --cli-input-json file://refreshCurrentEvents.json

4.    To remove the target and delete the rule, run the following commands:

aws events remove-targets --rule scheduled-refresh-currentEvents --ids scheduled-refresh-currentEvents
aws events delete-rule --name scheduled-refresh-currentEvents

View SQL status and results

To view the query status and result, follow these steps:

1.    Open the Amazon RedShift console.

2.    In the navigation pane, choose Query editor, and then choose Scheduled queries.

3.    Choose your query name.

4.    In Schedule history, note the ID. You will need the ID in step 7.

5.    To retrieve the query results, assume the IAM role used to schedule the query using the AWS CLI command assume-role similar to the following:

aws sts assume-role --role-arn "<Role ARN>" --role-session-name AWSCLI-Session

The output returns the access key ID, secret access key, and session token.

6.    Configure environment variables using the following export commands:

export AWS_ACCESS_KEY_ID=<RoleAccessKeyID>
export AWS_SECRET_ACCESS_KEY=<RoleSecretKey>
export AWS_SESSION_TOKEN=<RoleSessionToken>

7.    Retrieve the results using the AWS CLI command redshift-data similar to the following:

aws redshift-data get-statement-result --id <ID> --region <Region>

Related information

Why can't I connect to the Amazon Redshift query editor?

Using the Amazon Redshift Data API to interact with Amazon Redshift clusters

Creating an Amazon EventBridge rule that runs on a schedule

AWS OFFICIAL
AWS OFFICIALUpdated a year ago