AWS Database Blog

Implementing DB Instance Stop and Start in Amazon RDS

This post is from Matt Merriel at AWS partner Kloud, and Marc Teichtahl, manager for AWS Partner Solutions Architecture Australia and New Zealand. Kloud uses the new stop and start capabilities in Amazon RDS to lower costs for customers who don’t require 24×7 access to their databases during the testing and development phases of their projects.

Kloud provides professional and managed services to companies that are moving to the cloud. Amazon Relational Database Service (Amazon RDS) makes it easier for Kloud customers to go from project conception to deployment in a highly scalable, cost-effective, available, and durable fashion.

This post outlines deployment considerations, including security policies, filtering, and automation, to help you implement the DB instance stop and start capability in your environment.

Stopping and starting a DB instance—how it works

With the recently released DB instance stop and start feature in Amazon RDS, you can now stop and start database instances for up to seven days at a time. This makes it easier and more affordable to use Amazon RDS databases for development and test purposes when the database is not required to run all the time. This capability is similar to the feature available for Amazon EC2 instances.

To ensure ease of automation and management, the stop and start capability of Amazon RDS databases does not delete any previous backups, snapshots, or transaction logs. They simply remain in place when the database is stopped. As an added level of durability, the Amazon RDS service automatically backs up the DB instance while stopping it. This enables you to do point-in-time database restoration to any point within your configured automated backup retention window and decreases the time required for the initial backup after the DB instance is started. You aren’t charged while the database instance is stopping.

Starting an instance restores that instance to the configuration at the point in time when the instance was stopped. This includes endpoint configuration, DB parameter groups, option group membership, and security groups.

It’s possible to make changes to an option group that is associated with a stopped DB instance. If you choose to apply the changes immediately, Amazon RDS applies them the next time the DB instance is started. Otherwise, Amazon RDS applies them during the next maintenance window after the stopped DB instance has been started.

It’s important to understand that, for all Amazon RDS databases, persistent and permanent options can’t be removed from an option group while DB instances are associated with that option group. This restriction also applies to stopped instances. For example, calling the remove-option-from-option-group command fails on a persistent option in an option group that is associated with a DB instance that is in the stopping, stopped, or starting state.

It’s also possible to make changes to a parameter group that is associated with a stopped DB instance. If you choose to apply the changes immediately, Amazon RDS applies them the next time the stopped DB instance is started. Otherwise, Amazon RDS applies them during the next maintenance window after the stopped DB instance is started.

An Amazon RDS DB instance may be in a stopped state for up to seven days at a time, after which it is automatically started. This is to ensure that required maintenance is applied to stopped instances. If the instance needs to be stopped again, you can do it using lifecycle hooks, Amazon CloudWatch, and AWS Lambda. All Amazon RDS database instance types support the stop and start capability. The feature is not currently supported for Amazon Aurora.
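
Because a stopped instance is started again automatically after seven days, a scheduler that wants to keep it stopped needs a rough idea of when that happens. The following is a minimal sketch, assuming you record the stop time yourself. The function names and the exact seven-day arithmetic are illustrative assumptions, not part of any AWS API.

```javascript
// Maximum time an instance may stay stopped, per the limit described above.
const MAX_STOPPED_DAYS = 7
const MS_PER_DAY = 24 * 60 * 60 * 1000

// Returns the latest time the instance is expected to remain stopped.
// stoppedAt is a Date recorded by the caller (an assumption of this sketch).
function expectedAutoStart(stoppedAt) {
  return new Date(stoppedAt.getTime() + MAX_STOPPED_DAYS * MS_PER_DAY)
}

// True once the seven-day window has elapsed, meaning Amazon RDS has
// probably started the instance again and it may need to be re-stopped.
function hasLikelyAutoStarted(stoppedAt, now) {
  return now.getTime() >= expectedAutoStart(stoppedAt).getTime()
}
```

A scheduled function could call hasLikelyAutoStarted with the current time and issue another stop request when it returns true.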

Deployment considerations

The primary use cases that Kloud implements for their customers focus on cost optimization and operational efficiencies within testing and development environments. As such, it’s not necessary to deploy configurations that use read replicas, require high availability via Multi-AZ deployments, or need SQL Server mirroring.

A database instance can still have a snapshot performed while in a stopped state. It’s also worth noting that, like stopping an Amazon EC2 instance, charges are still incurred for the storage and backups of the DB instance regardless of the state.

To ensure consistency and durability of the DB instance and its associated data, once an Amazon RDS DB instance has been stopped, it can’t be modified. This includes migrating a stopped instance into a Multi-AZ deployment, creating a read replica, or deleting a parameter or option group.

If an existing database needs to become Multi-AZ, or it requires a read replica or the deletion of a parameter or option group, you can create a second DB instance with the required capabilities by restoring a snapshot of the stopped database.

Security policies

AWS Identity and Access Management (IAM) is a web service that helps you manage users and user permissions in AWS along with providing granular management of access to AWS resources.

As with all AWS services, a policy configuration is required to enforce appropriate access to the Amazon RDS stop and start capability. You can do this by allowing the rds:StartDBInstance and rds:StopDBInstance actions in an IAM policy. The following is a simple example of such a policy:

{
    "Statement": [{
            "Effect": "Allow",
            "Action": [
                "rds:DescribeDBInstances",
                "rds:StartDBInstance",
                "rds:StopDBInstance"
            ],
            "Resource": [
                "*"
            ]
    }]
}

DB instance filtering

Many AWS customers use tags and CloudWatch to identify and respond to events that occur in their environment. This approach also applies to Amazon RDS DB instances when you use the describe-db-instances command line interface (CLI) command. It’s useful when you want to identify the instances that are tagged as being able to stop and start. By filtering on a tag value (for example, stopstart), you can identify these instances and take an appropriate action. Alternatively, you can use the list-tags-for-resource CLI command to list all tags that are assigned to a specific instance. However, this approach requires you to already know the instance before querying.
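
As a rough illustration of tag-based filtering, the following sketch selects instances from a list shaped like the DBInstances array returned by describe-db-instances. The tag key Schedule is hypothetical, and the sketch assumes each entry carries its tags in a TagList array of Key and Value pairs. If your responses don't include tags, you would need to fetch them per instance with list-tags-for-resource first.

```javascript
// Hypothetical tag key; "stopstart" is the example value used above.
const TAG_KEY = 'Schedule'
const TAG_VALUE = 'stopstart'

// instances: array shaped like the DBInstances list, where each entry is
// assumed to have a TagList array of { Key, Value } objects.
function filterStoppable(instances) {
  return instances.filter(function (instance) {
    const tags = instance.TagList || []
    return tags.some(function (tag) {
      return tag.Key === TAG_KEY && tag.Value === TAG_VALUE
    })
  })
}

// Made-up sample data for illustration only.
const sample = [
  { DBInstanceIdentifier: 'dev-db', TagList: [{ Key: 'Schedule', Value: 'stopstart' }] },
  { DBInstanceIdentifier: 'prod-db', TagList: [{ Key: 'Environment', Value: 'prod' }] },
  { DBInstanceIdentifier: 'legacy-db' }
]

const ids = filterStoppable(sample).map(function (i) { return i.DBInstanceIdentifier })
console.log(ids) // [ 'dev-db' ]
```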

As with all AWS CLI-based approaches, be aware of the rate at which you make AWS API calls so that you avoid throttling errors. One way to ensure that a call eventually succeeds, even after it has been throttled, is to build in exponential backoff. For information on how to achieve this through API retries, see Error Retries and Exponential Backoff in AWS.
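
The backoff idea can be sketched as a small retry wrapper. This is a minimal example under stated assumptions, not the approach from the referenced guide: the helper name withBackoff, the default attempt count and delay, and the error codes treated as throttling are all assumptions that you should check against the errors you actually see.

```javascript
// Error codes treated as throttling; the exact codes are an assumption.
const THROTTLE_CODES = ['Throttling', 'ThrottlingException']

function sleep(ms) {
  return new Promise(function (resolve) { setTimeout(resolve, ms) })
}

// Calls operation(), which must return a Promise, and retries on throttling.
// The base delay doubles after each failed attempt, with random jitter added.
async function withBackoff(operation, maxAttempts = 5, baseDelayMs = 100) {
  for (let attempt = 1; ; attempt++) {
    try {
      return await operation()
    } catch (err) {
      const throttled = THROTTLE_CODES.includes(err.code)
      // Give up on non-throttling errors or when attempts are exhausted.
      if (!throttled || attempt >= maxAttempts) throw err
      const delay = baseDelayMs * Math.pow(2, attempt - 1)
      await sleep(delay + Math.random() * delay)
    }
  }
}
```

You would pass in a function that performs a single API call and returns a Promise, so the wrapper stays independent of any particular client library.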

Automating stopping and starting

Amazon EC2 instances can be used directly as the target of a CloudWatch event, but Amazon RDS DB instances can’t. Instead, you can automate the Amazon RDS stop and start capability with AWS Lambda and API calls, using a schedule (cron) event source as the trigger. This is a relatively trivial task. To learn more about the required API calls, see StopDBInstance.

The following is an example of such a Lambda function:

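// Assumes the Node.js Lambda runtime with the AWS SDK for JavaScript (v2).
var AWS = require('aws-sdk')
var rds = new AWS.RDS()

// Empty params: describe every DB instance in the function's Region.
var params = {}
var current_day
var current_hour

// Assumed helper: the event must carry the Environment tag value to match.
function hasFilterKey(event) {
  return Boolean(event && event.FilterValue)
}

// Assumed helper: schedule tags hold one hour per day of the week,
// indexed from Sunday (0), and are evaluated in UTC.
function getDayTime() {
  var now = new Date()
  current_day = now.getUTCDay()
  current_hour = now.getUTCHours()
}

exports.handler = (event, context, callback) => {

  if (!hasFilterKey(event)) {
    return callback(new Error('event.FilterValue is required'))
  }
  getDayTime()

  rds.describeDBInstances(params, function (rdserr, rdsdata) {
    if (rdserr)
      return callback(rdserr, null)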

    rdsdata.DBInstances.forEach(function (dbInstance) {
      var rdstagParams = {
        ResourceName: dbInstance.DBInstanceArn
      }

      rds.listTagsForResource(rdstagParams, function (tagerr, tagdata) {
        var toStartup = 0
        var toShutdown = 0
        var isInEnvironment = 0

        if (tagerr) {
          // an error occurred while listing tags; skip this instance
          return callback({
            error: tagerr,
            stack: tagerr.stack
          }, null)
        }

        var tags = tagdata.TagList || []
        tags.forEach(function (tag) {
          if (tag.Key == 'Environment' && tag.Value == event.FilterValue) {
            isInEnvironment = 1
          }

          if (tag.Key == 'StartUp') {
            var StartupSchedule = tag.Value.split(' ')
            if (StartupSchedule[current_day] == current_hour) {
              toStartup = 1
            }
          }

          if (tag.Key == 'Shutdown') {
            var ShutdownSchedule = tag.Value.split(' ')
            if (ShutdownSchedule[current_day] == current_hour) {
              toShutdown = 1
            }
          }
        })

        if (isInEnvironment) {
          if (toStartup) {
            var startparams = {
              DBInstanceIdentifier: dbInstance.DBInstanceIdentifier /* required */
            }
            rds.startDBInstance(startparams, function (starterr, startdata) {
              if (starterr)
                callback(starterr, null)
              else
                callback(null, startdata)
            })
          }

          if (toShutdown) {
            var shutdownparams = {
              DBInstanceIdentifier: dbInstance.DBInstanceIdentifier
            }
            rds.stopDBInstance(shutdownparams, function (stoperr, stopdata) {
              if (stoperr)
                callback(stoperr, null)
              else
                callback(null, stopdata)
            })
          }
        }
      })
    })
  })
}

You can also use AWS Lambda with scheduled events to automate a scheduled stop or start of Amazon RDS DB instances. For more information, see Using AWS Lambda with Scheduled Events.
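
The StartUp and Shutdown tags read by the function above are split on spaces and indexed by day, which suggests a value with one hour per day of the week. The following sketch isolates that check so it can be tested on its own. The exact tag format (seven entries, Sunday first, with a non-numeric placeholder such as "-" meaning no action) is an assumption, as is the helper name isScheduledHour.

```javascript
// Assumed tag format: space-separated hours (0-23), one per day of the week,
// indexed from Sunday (0). A non-numeric entry such as "-" means "no action".
function isScheduledHour(tagValue, day, hour) {
  const entries = tagValue.trim().split(/\s+/)
  // parseInt returns NaN for missing or non-numeric entries, so they never match.
  return parseInt(entries[day], 10) === hour
}

// Example: start at 08:00 Monday through Friday, never on weekends.
const startUpTag = '- 8 8 8 8 8 -'
const now = new Date()
const dueNow = isScheduledHour(startUpTag, now.getUTCDay(), now.getUTCHours())
```

Parsing each entry as a number, rather than comparing strings loosely, avoids an empty entry being treated as hour 0.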

Conclusion

Kloud has deployed the Amazon RDS stop and start capability in a number of their customers’ environments. Even after a short time, these customers are seeing up to a 50 percent reduction in their dev/test costs. As with all AWS services, the features and capabilities will be developed and enhanced over time, making the implementation process simpler and quicker.

By implementing the security policies, filtering, and automation outlined in this post, you can achieve improved deployment velocity for test and development environments, lower costs, and better operational efficiencies. AWS partner Kloud has seen its customers greatly increase the number of concurrent development environments that they can run at a reduced cost, resulting in faster and more frequent deployments. This in turn has allowed these customers to respond to market demands and technology changes with greater speed and confidence.
