Proactively Monitoring System Performance on Amazon Lightsail Instances
This post is contributed by Mike Coleman, AWS Senior Developer Advocate – Lightsail
I commonly hear from customers that they want to proactively identify issues that could affect system performance before they become a problem. For instance, they want to be alerted before an instance becomes unresponsive to a burst in traffic because it has exhausted all of its CPU burst capacity. Burst capacity can be consumed either by a workload that needs to operate in the burstable zone for long periods of time, or by unexpected CPU consumption from system processes. In either case, you’d want to be notified so you could take corrective action such as moving to a larger instance or stopping errant processes. To that end, today Amazon Lightsail launched a new feature allowing you to set up custom alarms to be notified when your burst capacity is running low.
Amazon Lightsail instances use burstable CPUs. These CPUs operate in two different zones: the sustainable zone and the burstable zone. The sustainable zone is based on the CPU’s baseline performance. As long as the CPU utilization stays below this baseline, the system will perform with no impact to system responsiveness. The burstable zone is entered whenever the CPU utilization climbs above the baseline. The instance can only operate in this zone for a finite amount of time (more on that below) before system performance is negatively impacted.
Earlier this year, we announced the release of resource monitoring, alarms, and notifications for Amazon Lightsail instances. This feature introduced a graph that shows when an instance operates in the CPU’s burstable zone. However, this was only a partial solution since there was no easy way for you to know how long your system could effectively operate in the burstable zone.
With today’s new feature, Lightsail has augmented that functionality by allowing you to see how much burstable capacity is available to your system at any given time. Additionally, you can create alarms and be notified when that burstable capacity has dropped to a critical level. This allows users to be proactively notified that there is a potential performance issue developing.
The remainder of this blog runs through how to configure a burstable capacity alert so you can prevent system performance issues before they impact your users.
Burstable capacity overview
Before I begin, it’s important to understand how burst capacity minutes are calculated. One minute of the CPU running at 100% is one minute of CPU burst capacity. By the same token, one minute of the CPU running at 50% is 30 seconds of CPU burst capacity. So, if a system has 72 minutes of available burst capacity and it’s running at 50% CPU, it can run in that state for 144 minutes before system performance is impacted.
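The arithmetic above can be sketched in a few lines of Python. This is only an illustration of the simplified accounting described in this post (one minute at 100% CPU consumes one minute of burst capacity); the function name `runtime_minutes` is made up for this example and is not part of any Lightsail API.

```python
def runtime_minutes(burst_minutes: float, cpu_percent: float) -> float:
    """Estimate how long an instance can run at a steady CPU level.

    Assumes the simplified model from this post: one minute at 100% CPU
    uses one minute of burst capacity, and lower utilization uses
    proportionally less.
    """
    if not 0 < cpu_percent <= 100:
        raise ValueError("cpu_percent must be greater than 0 and at most 100")
    # At 50% CPU, each wall-clock minute consumes only 0.5 burst minutes.
    return burst_minutes / (cpu_percent / 100)


# The example from the text: 72 burst minutes at 50% CPU.
print(runtime_minutes(72, 50))  # 144.0
```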
CPU burstable capacity can be displayed two ways: percentage available and minutes available, with the default being percentage. CPU burst capacity percentage is a simple calculation of dividing the available burst capacity (minutes) by the maximum available burst capacity (minutes). For example, an instance with 36 minutes of burst capacity left out of a maximum available limit of 72 minutes would be at 50%.
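As a quick sketch of that percentage calculation, the helper below divides available minutes by the maximum and scales to a percentage. The name `burst_capacity_percent` is hypothetical and exists only for this illustration.

```python
def burst_capacity_percent(available_minutes: float, max_minutes: float) -> float:
    """Return remaining burst capacity as a percentage of the maximum."""
    if max_minutes <= 0:
        raise ValueError("max_minutes must be positive")
    return available_minutes / max_minutes * 100


# The example from the text: 36 minutes left out of a 72-minute maximum.
print(burst_capacity_percent(36, 72))  # 50.0
```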
Configuring a Burstable Capacity Alert
Let’s take a look at how to configure an alert for CPU burst capacity (percentage).
Note: If you’re not familiar with the general concepts around how to configure alerts and notifications in Lightsail, you can read this blog post.
- From the Lightsail home page click on the instance you wish to create the alert for.
- From the horizontal menu, choose Metrics.
- You should now see the graphs for both CPU utilization and burst capacity.
Notice how the CPU utilization graph shows the sustainable and the burstable zones.
The burst capacity graph shows the percentage of burst capacity you have remaining, or the number of minutes you have before system performance is impacted.
In the following graphs, you can see that the CPU is operating at about 15% and has just over 90% of remaining burst capacity.
- Scroll down and click CPU burst capacity (percentage) under
- Click +Add alarm.
- For my alarm, I choose to be notified whenever the percentage of available CPU burst capacity drops below 25% for 10 consecutive minutes.
- At this point, you can enable notifications via email or SMS message (instructions on how to do this can be found in the blog post I linked earlier). Regardless of whether you choose to enable notifications, you’ll receive a banner in the Lightsail console whenever your alarm threshold is breached.
- Click Create to create your alarm.
Lightsail is now configured with a single alarm. I consider it a best practice to configure two alarms. The first is a warning level, and the second is the critical level. This ensures that you have additional time to respond to developing problems. If you’d like to create a second alarm, just repeat the previous steps.
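If you prefer to script this rather than use the console, the following is a minimal sketch of how a warning and a critical alarm might be created with the AWS SDK for Python (boto3) and the Lightsail `put_alarm` operation. Several things here are assumptions rather than anything stated in this post: the instance name `my-instance`, the alarm names, and the 50% warning threshold are placeholders; the 25% critical threshold mirrors the alarm I created above; and the sketch assumes each evaluation period is five minutes long, so 10 consecutive minutes maps to two periods. Check the current API reference before relying on it.

```python
import math


def burst_alarm_params(instance_name, alarm_name, threshold_percent,
                       minutes=10, period_minutes=5):
    """Build keyword arguments for a Lightsail put_alarm call.

    period_minutes=5 is an assumption about the alarm evaluation period.
    """
    periods = max(1, math.ceil(minutes / period_minutes))
    return {
        "alarmName": alarm_name,
        "metricName": "BurstCapacityPercentage",
        "monitoredResourceName": instance_name,
        "comparisonOperator": "LessThanThreshold",
        "threshold": float(threshold_percent),
        # Require every data point in the window to breach the threshold.
        "evaluationPeriods": periods,
        "datapointsToAlarm": periods,
    }


def create_burst_alarms(instance_name):
    """Create a warning and a critical alarm (requires AWS credentials)."""
    import boto3  # imported here so the helper above works without boto3

    client = boto3.client("lightsail")
    # Hypothetical alarm names and thresholds; adjust to your workload.
    for alarm_name, threshold in (("burst-warning", 50), ("burst-critical", 25)):
        client.put_alarm(**burst_alarm_params(instance_name, alarm_name, threshold))
```

Calling `create_burst_alarms("my-instance")` would issue the two API calls. Keeping the parameter construction in its own function lets you inspect or print the request before anything is sent to AWS.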
In this blog, I covered the concepts behind Lightsail’s burstable CPUs, and how you create alarms to respond before your system runs out of burst capacity. If you see that your system is frequently running out of burst capacity, you should investigate the processes running on your instance to find any that are consuming extra CPU, or consider upgrading your instance to a larger plan. Read more about this latest feature here, and check out our Getting Started page for more tutorials and resources.