Manage databases through custom skills with Amazon Alexa and AWS Systems Manager

Over the years, customers have used Amazon Alexa voice commands to order supplies, listen to music, support meetings, manage home devices, and get weather and news updates. But what about AWS resource management?

AWS managed and fully managed services already reduce your administrative tasks, letting you focus your resources on applications. Now, voice interaction can further reduce your administrative effort:

IT administrators can control and manage infrastructure
Developers can check metrics and perform predefined tasks securely
Managers can keep tabs on IT resource status and metrics

Moreover, they can now perform these tasks without touching a keyboard.

In this post, I cover how to configure your infrastructure to manage your Amazon RDS instances or your Amazon Aurora instances using Alexa voice commands. Ask Alexa to email your database instance metrics, configuration, and status, or even to perform reboots or failovers.

Prerequisites

Proper voice management of your AWS resources involves access to a combination of AWS services, including:

Alexa, Amazon’s cloud-based voice service, enables you to interact intuitively with the technology you use every day. To help you make the most of the service, Amazon offers the Alexa Skills Kit, a collection of tools and APIs to build voice experiences on Alexa. You can host Alexa skills in several ways:

Alexa hosted skills
On your web service
With a Lambda function invoked by voice using an Amazon Echo device (the method used for this post)

For all the instructions to create and test your own private Alexa skill, see the aws-systems-manager-database-voice-commands GitHub repo. In the next section, I go over what you’re going to build.

Alexa skill

Alexa skills consist of voice commands enabling Alexa operations. Each Alexa skill needs a proper interaction model, including the following resources:

An invocation, which signals Alexa to start the interaction
Intents, which signal known tasks to Alexa
Slot types, which specify or narrow the tasks requested
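As a rough illustration, the three resources above map onto the JSON interaction model of a custom skill. The following sketch builds a minimal model as a Python dictionary. The invocation name, the sample utterance, and the slot and slot type names (application, region, APPLICATION_NAME, AWS_REGION) are assumptions made for this example and are not taken from the GitHub repo.

```python
import json

# Hypothetical, minimal interaction model with one custom intent.
interaction_model = {
    "interactionModel": {
        "languageModel": {
            # Invocation: the phrase that starts the interaction with the skill.
            "invocationName": "database manager",
            "intents": [
                {
                    # Intent: a task that the skill knows how to perform.
                    "name": "ListInstancesForApplication",
                    # Slots: the variable parts that narrow the request.
                    "slots": [
                        {"name": "application", "type": "APPLICATION_NAME"},
                        {"name": "region", "type": "AWS_REGION"},
                    ],
                    "samples": [
                        "list instances of application {application} in {region}",
                    ],
                },
                # Built-in intents need no slots or sample utterances.
                {"name": "AMAZON.HelpIntent", "samples": []},
            ],
            # Slot types: the values each custom slot can take.
            "types": [
                {
                    "name": "APPLICATION_NAME",
                    "values": [{"name": {"value": "blog"}}, {"name": {"value": "training"}}],
                },
                {
                    "name": "AWS_REGION",
                    "values": [{"name": {"value": "oregon"}}, {"name": {"value": "ireland"}}],
                },
            ],
        }
    }
}

# Serialized form, ready to paste into the skill's JSON editor.
model_json = json.dumps(interaction_model, indent=2)
```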

As an administrator, developer, IT manager, or other IT professional, you can ask Alexa to get information from, and run commands against, a database fleet running on AWS. To begin, invoke the skill, then give Alexa the task by stating a recognized intent.

I group the intents relevant to Systems Manager into four categories:

Inventory
- NumberOfInstances
- ListInstances
- ListApplications
- ListInstancesForApplication

Checks
- CheckInstanceMetric
- ListAlarms
- CheckInstanceProperty
- TablespaceUsage (only for Oracle)
- ShowTopSession (only for Oracle)
- ListEvents
- CheckInstanceParameter (only for Oracle)
- CheckLog

Tasks
- ConnectInstance
- RebootInstance
- FailoverInstance

Others
- SetNotifications
- Built-In-Intents

Before describing these intents in more depth, let me describe the main architecture of the system.

System architecture and the importance of tags

The infrastructure for managing your RDS database fleet through voice commands is shown in the following diagram:

Implemented by a Lambda function, the Alexa skill communicates with many other AWS services. Specifically, the Alexa skill:

Retrieves information directly from RDS (for example, intent ListInstances).
Runs tasks directly on RDS (for example, intent FailoverInstance), interacting with AWS Security Token Service (STS) when multi-factor authentication (MFA) is required.
Retrieves information by running Systems Manager commands based on scripts stored on an EC2 instance (for example, intent TablespaceUsage).
Runs tasks by running Systems Manager commands based on scripts stored on an EC2 instance (for example, intent ShowTopSession).
Retrieves metrics and alarms from CloudWatch (for example, intent CheckInstanceMetric).
Sends the content of database logs using SNS (for example, intent CheckLog).
(If enabled) Sends the results of each single task execution back to the user through email using SNS (intent SetNotifications).
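The Lambda function behind the skill has to route each incoming intent to the code that talks to RDS, CloudWatch, Systems Manager, or SNS. A minimal sketch of that routing follows, assuming the standard Alexa request and response JSON envelopes. The handler function, its reply text, and the slot name region are placeholders for illustration, not the code from the GitHub repo.

```python
def build_response(speech_text, end_session=True):
    """Wrap plain text in the JSON envelope that Alexa expects from a skill."""
    return {
        "version": "1.0",
        "response": {
            "outputSpeech": {"type": "PlainText", "text": speech_text},
            "shouldEndSession": end_session,
        },
    }


def get_slots(intent):
    """Flatten the intent's slots into a {slot_name: spoken_value} dictionary."""
    return {name: slot.get("value") for name, slot in intent.get("slots", {}).items()}


def handle_list_instances(slots):
    # Placeholder: a real handler would query RDS in the requested Region.
    return "Listing instances in " + str(slots.get("region"))


# One entry per intent; only one is shown here.
INTENT_HANDLERS = {"ListInstances": handle_list_instances}


def lambda_handler(event, context):
    request = event["request"]
    if request["type"] == "LaunchRequest":
        # The user opened the skill without naming a task yet.
        return build_response("Which database task can I help with?", end_session=False)
    if request["type"] == "IntentRequest":
        intent = request["intent"]
        handler = INTENT_HANDLERS.get(intent["name"])
        if handler is None:
            return build_response("Sorry, I do not know that task.")
        return build_response(handler(get_slots(intent)))
    # Other request types (for example, SessionEndedRequest) need no spoken reply.
    return {"version": "1.0", "response": {}}
```

Keeping the handlers in a dictionary means that adding an intent is a one-line change, and each handler can be tested without an Echo device.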

In this post, I show you how to work on a heterogeneous database fleet, involving various engines, as shown in the following screenshot:

To manage these resources properly through Alexa, you must set up some specific tags:

Tag “application” to let Alexa answer questions like “list instances of application ‘training’ in Oregon.”
Tag “instance” to let Alexa answer questions like “status of ‘Blog’ instance in Oregon.”
Tag “environment” to let Alexa answer questions like “number of ‘production’ instances in Oregon.”

The “instance” tag provides a more readable label for humans (and Alexa) than the given, typically character-limited RDS instance name.
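To make the role of the “instance” tag concrete, the sketch below resolves a spoken label such as “Blog” to a DB instance identifier. It assumes the instance descriptions have the shape returned by the RDS DescribeDBInstances action, including the TagList field. The function name resolve_instance and the case-insensitive matching are choices made for this example.

```python
def resolve_instance(db_instances, label):
    """Return the DB instance identifier whose "instance" tag matches the spoken label.

    db_instances is a list shaped like the "DBInstances" list returned by
    describe_db_instances. Raises LookupError when no instance, or more than
    one instance, carries the label.
    """
    matches = []
    for db in db_instances:
        tags = {tag["Key"]: tag["Value"] for tag in db.get("TagList", [])}
        # Spoken input has no reliable casing, so compare case-insensitively.
        if tags.get("instance", "").lower() == label.lower():
            matches.append(db["DBInstanceIdentifier"])
    if len(matches) != 1:
        raise LookupError(
            "expected one instance tagged {!r}, found {}".format(label, len(matches))
        )
    return matches[0]
```

Failing on ambiguous labels is deliberate here: a voice command such as a reboot should never pick one of several matching instances silently.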

Retrieve information from your fleet

One intent that you can use to retrieve information from the database fleet is ListInstancesForApplication:

User: List instances of application BLOG in OREGON
Alexa: List of the RDS instances for the application BLOG in OREGON: db11204a db12201a db12201b [..]

This intent triggers a Lambda function, which calls the describe-db-instances RDS command to retrieve a list of the databases running in a specific Region, filtered by application. In this case, Alexa supplies the information out loud as a vocal reply.

For intents that can return a large number of results, the code limits the output to a maximum of 20 results, and the API actions run with the default page-size attributes.
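A sketch of how such an intent could be served is shown below. It makes a single describe_db_instances call, filters on the “application” tag, and caps the spoken list at 20 entries. The rds_client argument is assumed to be a boto3 RDS client created for the requested Region. The function name and the wording of the empty-result reply are assumptions.

```python
MAX_RESULTS = 20  # cap on the number of identifiers read out loud


def list_instances_for_application(rds_client, application, region, limit=MAX_RESULTS):
    """Build the spoken reply for the ListInstancesForApplication intent."""
    # One call with the default page size; no further pages are requested.
    response = rds_client.describe_db_instances()
    names = []
    for db in response["DBInstances"]:
        tags = {tag["Key"]: tag["Value"] for tag in db.get("TagList", [])}
        if tags.get("application", "").lower() == application.lower():
            names.append(db["DBInstanceIdentifier"])
    if not names:
        return "No RDS instances found for the application {} in {}".format(application, region)
    return "List of the RDS instances for the application {} in {}: {}".format(
        application, region, ", ".join(names[:limit])
    )
```

In a Lambda function, the client would typically be created with boto3.client("rds", region_name=...), after mapping the spoken Region name (for example, Oregon) to its Region code.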

Check your fleet

Are you interested in the performance of one database instance? Use the intent CheckInstanceMetric to find the value of a CloudWatch metric for a specified instance and time window:

User: CPU Utilization for REPORTING instance in OREGON 30 MINUTES AGO
Alexa: 80 percent of CPU Utilization for the REPORTING instance in OREGON

The Lambda function retrieves the database instance’s DB instance identifier through the RDS describe-db-instances command. Then the Lambda function calls the CloudWatch get-metric-statistics command to obtain the exact metric value.
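The CloudWatch call could look roughly like the following sketch. It assumes that “30 minutes ago” means the first one-minute average in a five-minute window starting at that time; the window length, the period, and the function name are assumptions, and cw_client is expected to be a boto3 CloudWatch client for the instance's Region.

```python
from datetime import datetime, timedelta, timezone


def check_instance_metric(cw_client, db_identifier, metric_name, minutes_ago, now=None):
    """Return the average of an RDS metric around `minutes_ago`, or None if no data."""
    now = now or datetime.now(timezone.utc)
    start = now - timedelta(minutes=minutes_ago)
    end = start + timedelta(minutes=5)  # assumed five-minute look-up window
    response = cw_client.get_metric_statistics(
        Namespace="AWS/RDS",
        MetricName=metric_name,  # for example, "CPUUtilization"
        Dimensions=[{"Name": "DBInstanceIdentifier", "Value": db_identifier}],
        StartTime=start,
        EndTime=end,
        Period=60,
        Statistics=["Average"],
    )
    # Datapoints are not guaranteed to arrive in time order, so sort them first.
    datapoints = sorted(response["Datapoints"], key=lambda point: point["Timestamp"])
    return datapoints[0]["Average"] if datapoints else None
```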

Next, verify whether any CloudWatch alarms defined for a specific database instance fired—and, if so, which ones. The intent ListAlarms serves this purpose. By calling the CloudWatch DescribeAlarms action, you can get a list of the alarms with the state equal to ALARM, filtered by the alarm prefix:

User: List of alarms fired for REPORTING instance in OREGON
Alexa: The following alarms have the ALARM state for the REPORTING instance in OREGON: cloudwatch-alarm-reporting-DatabaseConnections cloudwatch-alarm-reporting-SwapUsage [..]
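A minimal sketch of the ListAlarms lookup follows. The alarm naming convention (cloudwatch-alarm- followed by the lowercase instance label) is inferred from the sample reply above and is an assumption, as is the function name. cw_client is expected to be a boto3 CloudWatch client.

```python
def list_fired_alarms(cw_client, instance_label, prefix="cloudwatch-alarm-"):
    """Return the names of alarms in the ALARM state for one instance label."""
    response = cw_client.describe_alarms(
        # Assumed convention: alarm names start with the prefix and the label.
        AlarmNamePrefix=prefix + instance_label.lower(),
        StateValue="ALARM",
    )
    return [alarm["AlarmName"] for alarm in response["MetricAlarms"]]
```

A single call returns only the first page of alarms, which is in line with the capped output described earlier for intents that can return many results.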

Even if a database instance’s status appears as “available,” the instance might not be reachable. Use the intent ConnectInstance to test the database connection. This intent runs a .sql script through the Systems Manager Run Command tool:

User: connect to HR instance in Oregon
Alexa: Database reachable

The .sql script is stored on an EC2 instance, which acts as the database client.
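Running a script through Run Command is asynchronous: the command is sent first, and its result is polled afterward. The sketch below shows that pattern with the AWS-RunShellScript document. The function name, the timeout values, and the idea of passing the shell command as a plain string are assumptions; ssm_client is expected to be a boto3 Systems Manager client.

```python
import time

FINAL_STATES = ("Success", "Failed", "Cancelled", "TimedOut")


def run_script_on_client(ssm_client, ec2_instance_id, command,
                         timeout_seconds=30, poll_seconds=1.0):
    """Run a shell command on the database client instance and wait for its result.

    Returns a (status, standard_output) tuple.
    """
    sent = ssm_client.send_command(
        InstanceIds=[ec2_instance_id],
        DocumentName="AWS-RunShellScript",
        Parameters={"commands": [command]},
    )
    command_id = sent["Command"]["CommandId"]
    deadline = time.monotonic() + timeout_seconds
    while time.monotonic() < deadline:
        # Wait before polling: the invocation may not be registered immediately.
        time.sleep(poll_seconds)
        invocation = ssm_client.get_command_invocation(
            CommandId=command_id, InstanceId=ec2_instance_id
        )
        if invocation["Status"] in FINAL_STATES:
            return invocation["Status"], invocation.get("StandardOutputContent", "")
    return "TimedOut", ""
```

The command string would invoke the database client with the stored .sql script; its exact form depends on the engine and on where the scripts live on the instance.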

You don’t have to store any credentials inside the .sql script or inside the Lambda function code. The master user password used to run the script is stored inside a parameter managed by Systems Manager with the Parameter Store tool, as shown in the following code example:

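import boto3

# WithDecryption is needed if the parameter is a SecureString; it is ignored for a plain String
ssm = boto3.client('ssm')
response = ssm.get_parameter(Name="/rds/" + instance_name + "/dbpassword", WithDecryption=True)

dbpassword = response['Parameter']['Value']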

The described mechanism also applies to other intents, like TablespaceUsage or ShowTopSession. The latter intent is an example of performance monitoring. In this case, you can retrieve the top session for a specific metric (for example, CPU, memory, DB time, and so on).

User: show top session by CPU for REPORTING instance in Oregon
Alexa: The top session by CPU is the following: Session ID 3345 and Username RDSADMIN

For security purposes, database security groups should always restrict access to the authorized IPs, IP ranges, and security groups. In this case, databases can be accessed only from the EC2 instance containing the .sql scripts.

The EC2 instance storing the .sql scripts should exist in every Region where you manage your databases. The server acts as database client and at the same time as a “bastion,” running in a public subnet of the database’s home VPC. At least, this is the network configuration put in place for this post.

When you ask Alexa to test the connection to a database in a specific Region, the right EC2 instance is identified using the associated tag. The system stores this tag as a parameter in the Parameter Store tool provided by Systems Manager.
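The tag-based lookup could be sketched as follows. The parameter name /rds/client-tag and the use of the Name tag key are hypothetical, since the post does not state them. ssm_client and ec2_client are expected to be boto3 clients created for the Region named in the voice command.

```python
def find_client_instance(ssm_client, ec2_client, parameter_name="/rds/client-tag"):
    """Return the ID of the running EC2 instance that holds the .sql scripts, or None."""
    # The tag value identifying the client instance is kept in Parameter Store.
    tag_value = ssm_client.get_parameter(Name=parameter_name)["Parameter"]["Value"]
    response = ec2_client.describe_instances(
        Filters=[
            {"Name": "tag:Name", "Values": [tag_value]},  # assumed tag key
            {"Name": "instance-state-name", "Values": ["running"]},
        ]
    )
    for reservation in response["Reservations"]:
        for instance in reservation["Instances"]:
            return instance["InstanceId"]
    return None
```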

Check logs

RDS allows you to view, download, and watch database logs using the RDS console, the AWS CLI, or the RDS API. The list of database log types available depends on the engine used. For more information about RDS database logs, see Amazon RDS Database Log Files.

Using the RDS download-db-log-file-portion command, you can extract the most important database log content from your database instance.

User: check alert log of REPORTING instance in OREGON
Alexa: WARNING: errors found in the log

The content of the requested log can be filtered to spot generic or specific errors.
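A sketch of the log check follows. It reads the most recent lines of one log file and keeps those that contain an error pattern. The log file name is engine-specific and must be supplied by the caller, and the patterns, the line count, the function name, and the wording of the no-error reply are assumptions. rds_client is expected to be a boto3 RDS client.

```python
def check_log(rds_client, db_identifier, log_file_name,
              patterns=("ORA-", "ERROR"), number_of_lines=200):
    """Return (spoken_reply, matching_lines) for the tail of one database log."""
    response = rds_client.download_db_log_file_portion(
        DBInstanceIdentifier=db_identifier,
        LogFileName=log_file_name,
        # Without a Marker, the most recent lines of the file are returned.
        NumberOfLines=number_of_lines,
    )
    log_text = response.get("LogFileData") or ""
    matches = [
        line for line in log_text.splitlines()
        if any(pattern in line for pattern in patterns)
    ]
    if matches:
        return "WARNING: errors found in the log", matches
    return "No errors found in the log", matches
```

The matching lines are too long to read aloud, which is why the skill sends the log content through SNS and keeps the spoken reply short.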

Run tasks against your fleet

You can now reboot or fail over a database instance by voice using the RebootInstance and FailoverInstance intents, respectively. When you call the RebootInstance intent, the Lambda function calls the reboot_db_instance command directly, which reboots a specific database instance or read replica (RDS or Aurora). If you call the FailoverInstance intent, the Lambda function calls either:

The reboot_db_instance command directly (with the ForceFailover parameter set to TRUE), if you are working with an RDS instance;

The failover_db_cluster command, if you are working with Aurora.
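The choice between the two calls can be sketched as below. The sketch assumes that an instance belonging to an Aurora cluster is recognized by the DBClusterIdentifier field of its DescribeDBInstances description, which is one possible test and not necessarily the one used in the repo. The MFA check is deliberately left out here, and the function name is hypothetical.

```python
def fail_over(rds_client, db_instance):
    """Trigger a failover for one instance description; return which call was used."""
    cluster_id = db_instance.get("DBClusterIdentifier")
    if cluster_id:
        # Cluster member (Aurora): the failover is requested at the cluster level.
        rds_client.failover_db_cluster(DBClusterIdentifier=cluster_id)
        return "failover_db_cluster"
    # Plain RDS instance: a reboot with forced failover (requires Multi-AZ).
    rds_client.reboot_db_instance(
        DBInstanceIdentifier=db_instance["DBInstanceIdentifier"],
        ForceFailover=True,
    )
    return "reboot_db_instance"
```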

Because both actions are critical, protect the related API actions with MFA. MFA adds extra security by requiring users to provide unique authentication in addition to their AWS sign-in credentials. Users need an AWS-supported MFA mechanism to generate a six-digit numeric code, the MFA token. For more information, see the IAM documentation on enabling MFA devices.

For this demo, every time you ask Alexa for a reboot or a failover, you must provide an MFA token to authenticate yourself. The token, spoken to Alexa together with the command, is sent to STS through the GetSessionToken action.

AWS STS enables you to request temporary, limited-privilege credentials for IAM users or for federated users whom you authenticate. Users rely on these temporary credentials for the authorization to perform failovers and reboots.

Here’s a challenge: The GetSessionToken action provided by STS cannot be requested directly by the Lambda function. When a Lambda function executes, it automatically assumes its IAM execution role to interact with other AWS services. But GetSessionToken must be called using an IAM user’s long-term AWS security credentials, and not through an IAM role. So how can you request a reboot or failover using Alexa?

The solution is to create a separate IAM user with the following characteristics:

Programmatic access enabled
No policies associated
No groups assigned
MFA authentication enabled

The AWS CloudFormation template used to build the demo provides an IAM user’s long-term AWS security credentials as input parameters and stores them as encrypted parameters with Parameter Store.
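Inside the Lambda function, those stored credentials can be used to build an STS client that acts as the IAM user instead of the execution role. The following is a sketch under stated assumptions: the parameter names under /alexa/mfa-user are hypothetical, and client_factory is expected to be boto3.client, passed in as an argument so that the function can be exercised without AWS access.

```python
def build_sts_client(ssm_client, client_factory, prefix="/alexa/mfa-user"):
    """Create an STS client that uses the IAM user's long-term credentials."""

    def read(name):
        # WithDecryption returns the plain-text value of an encrypted parameter.
        response = ssm_client.get_parameter(Name=prefix + "/" + name, WithDecryption=True)
        return response["Parameter"]["Value"]

    return client_factory(
        "sts",
        aws_access_key_id=read("access-key-id"),
        aws_secret_access_key=read("secret-access-key"),
    )
```

A call such as build_sts_client(boto3.client("ssm"), boto3.client) would return the client on which GetSessionToken is later invoked.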

When you request a reboot or a failover through voice, the Lambda function confirms the validity of the provided MFA token by calling the previously described GetSessionToken. Here’s the code run when you call the RebootInstance intent:

if IsMFAValid(mfatoken):  # mfatoken (assumed name) holds the six-digit code spoken by the user
  rds.reboot_db_instance(DBInstanceIdentifier=identifier, ForceFailover=False)
  final_str = "Reboot in progress for the " + instance_name + " instance in " + region
else:
  final_str = "Reboot not authorized for the " + instance_name + " instance in " + region

The following code shows the IsMFAValid function. The token belongs to the MFA device of an IAM user with no permissions (described earlier in this post), which is used only to authorize a reboot or failover:

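from botocore.exceptions import ClientError, ParamValidationError

def IsMFAValid(token):
    # stscall is an STS client created with the IAM user's long-term credentials.
    # GetSessionToken raises an error when the token is invalid or malformed.
    try:
        stscall.get_session_token(
            SerialNumber='arn:aws:iam::959815022820:mfa/nopermissions',
            TokenCode=token,
        )
        return True
    except (ClientError, ParamValidationError):
        return False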

Here is an example of an interaction with Alexa:

User: failover BLOG instance in OREGON mfa 645 281
Alexa: Failover in progress for the BLOG instance in OREGON

What’s next

You can also manage database passwords using AWS Secrets Manager—a service that helps you manage and retrieve database credentials, API keys, and other secrets. Secrets Manager even lets you securely rotate these sensitive items without deploying code.
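As a sketch of that alternative, the function below reads database credentials from Secrets Manager instead of Parameter Store. It assumes the secret is stored as a JSON string with username and password keys, which is the layout Secrets Manager uses for RDS database secrets. The function name and the secret ID in the usage note are hypothetical.

```python
import json


def get_db_credentials(secrets_client, secret_id):
    """Return (username, password) from a JSON-formatted database secret."""
    response = secrets_client.get_secret_value(SecretId=secret_id)
    secret = json.loads(response["SecretString"])
    return secret["username"], secret["password"]
```

With a boto3 client, the call could look like get_db_credentials(boto3.client("secretsmanager"), "rds/reporting"), where the secret ID is a placeholder.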

New services like Amazon Lex and Amazon Polly help you create interactive web and mobile applications in which you manage data through voice and chat and see the results on screen immediately.

Finally, you can also learn to manage more advanced aspects of your databases with your voice. For example, you can activate or deactivate the Multi-AZ feature of a specific instance; add or remove a read replica; or manage cloning, restore, and recovery operations.

Conclusion

In this post, I showed how you can use Amazon Alexa to securely monitor and manage an RDS database or an entire fleet. With proper standardization and automation in place, you can now use your voice to manage complex data environments, even an entire infrastructure. IT administrators and anyone authorized to work with these resources can retrieve information from their systems quickly and securely, through email or SMS, and use it as needed.

Please feel free to ask questions and to share your thoughts. We’d love to hear what you think!


About the Author

Marco Tamassia is a technical instructor at AWS.

AWS Database Blog
