AWS Cloud Operations & Migrations Blog

Visualizing Amazon CloudWatch Costs – Part 2 – Where does the data come from?

In part 1 of this series we explored an Amazon CloudWatch dashboard which provides a real-time view of some of the typical main contributors to CloudWatch costs. In this second post, we’ll look at how the CloudWatch dashboard widgets were created so that you can learn how to create something similar, or modify the widgets to suit your needs.

For a description of the widgets and how to use them, see part 1 of this blog series.

Within this post, we’ll be describing how we worked with two main types of widgets: metric widgets and custom widgets. This isn’t an exhaustive explanation of these features. However, along with AWS documentation, it should get you started and give you the capability to create something similar yourself. Note that custom widgets require code to be developed – we’ll highlight key parts of the code, but we aren’t attempting to teach you to code or create AWS Lambda functions. You may need to use additional resources and AWS documentation to support your needs.
Note that for all of the examples here, only regions which don’t require an opt-in have been included. See the AWS documentation on Enabling a Region for further details.

Metric widgets – data by region

The log ingest volume, log event count, and API metric count widgets are all created using CloudWatch metrics available by default.

Metric widgets are created for various metrics as a total for each region individually, and a total across all regions. We do this by using a combination of Metric Math Expressions. We’ll look at the “Total Log Ingestion by Region: Volume (GB)” widget shown in the following figure.

Widget screenshot showing a widget displaying a line chart with Log Ingestion Volume (in GB) for each region individually, and as a total for all regions.

Figure 1: CloudWatch widget showing Log Ingestion Volume (GB) by region.

Specifying more than one region can be achieved by modifying the metric source (see the following figure). This is a text editor where you can create the appropriate JSON for your widget. You can find the source tab when you create/edit a metric widget.

When you edit the source directly, remember to Save it (top right of the source code window – see the following screenshot). If you navigate away from the tab without doing this, then you’ll lose your changes. If your syntax is correct, then the Update Widget button will change from grey to orange after you Save. If it doesn’t change to orange, then there is likely an error in your source text.

Screenshot of the console showing the metric source editor for a metric widget. Source changes must be saved after making any changes, and before updating the widget.

Figure 2: Metric Source tab showing some source JSON, highlighting the source save button.

The source for the “Total Log Ingestion by Region: Volume (GB)” widget shows the following parts:

  1. The IncomingBytes metric from the AWS/Logs namespace for each region.
[ "AWS/Logs", "IncomingBytes", { "label": "us-east-1", "id": "m1", "visible": false } ],
[ "...", { "region": "us-east-2", "label": "us-east-2", "id": "m2", "visible": false } ],
	… continue for each region …
  2. A metric math expression to convert all of these from bytes to GB.
[ { "expression": "METRICS()/1000000000", "label": "", "id": "e1", "region": "us-east-1"} ],
  3. A math expression with the SUM function to get the total over all of the regions. Again, we convert from bytes to GB.
[ { "expression": "SUM(METRICS())/1000000000", "label": "All regions", "id": "e2" } ],

A few things to note:

  • The order of the metrics determines the order in which they will appear in the legend.
  • The IncomingBytes metrics are set as visible: false as we don’t want to display the original data in bytes, but rather display the data once converted to GB.
  • The three dots in the metric source are a shorthand to show that there is repeated data. In this case, the Namespace and Metric name are the same as the previous line. Even if you enter the full information, CloudWatch will shorten this once you have saved the source.
  • Each metric entry has an ID – m# is used for metrics, and e# for expressions.
  • When we use METRICS(), this math expression will be applied to all of the metrics that you have in your graphed metrics tab (only raw metrics, not expressions). Therefore, you’ll see a new series for each region. For more information, see the AWS documentation on the METRICS() function.
  • Sample metric math expressions, like converting bytes to GB, can be found when you create your math expression from the Graphed metrics tab. In this case, choose Add math > Common > Bytes to GB.
  • The expression for converting from bytes to GB has an empty label. The label from the underlying metrics will be carried through (in this case, the region).
  • The overall source code shows the statistic as Sum, and the period as 86400 seconds (one day).
{
    "metrics": [
        …
    ],
    "view": "timeSeries",
    "stacked": false,
    "region": "us-east-1",
    "period": 86400,
    "stat": "Sum"
}
  • When you return to the source code, the region property may have been removed from the metric which is in the same region as your dashboard region (in our case this is us-east-1). You can see the dashboard region stated in the region property.
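
Writing one metric line per region by hand becomes repetitive as the region list grows. As a minimal sketch, assuming Node.js and a hypothetical buildRegionWidgetSource helper (not part of the dashboard's own code), the source described above could be generated from a list of regions. The region list shown is illustrative only.

```javascript
// Hypothetical helper: builds the source JSON for a "by region" metric widget.
// The function name and the region list below are illustrative assumptions.
function buildRegionWidgetSource(regions, dashboardRegion) {
    // One hidden IncomingBytes metric per region, with IDs m1, m2, ...
    const metrics = regions.map((region, i) => [
        "AWS/Logs", "IncomingBytes",
        { region, label: region, id: `m${i + 1}`, visible: false }
    ]);
    // Convert every raw metric from bytes to GB (one series per region).
    metrics.push([{ expression: "METRICS()/1000000000", label: "", id: "e1" }]);
    // Total across all regions, also converted to GB.
    metrics.push([{ expression: "SUM(METRICS())/1000000000", label: "All regions", id: "e2" }]);
    return {
        metrics,
        view: "timeSeries",
        stacked: false,
        region: dashboardRegion,
        period: 86400, // one day, in seconds
        stat: "Sum"
    };
}

const source = buildRegionWidgetSource(["us-east-1", "us-east-2", "us-west-2"], "us-east-1");
console.log(JSON.stringify(source, null, 4));
```

The printed JSON can be pasted into the Source tab. The sketch spells out the namespace and metric name on every line rather than using the three-dot shorthand, and CloudWatch may shorten the entries or drop the region property for the dashboard region once the source is saved.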

The widgets for the log event count for each region, and for the Metric API calls of GetMetricData, PutMetricData, and GetMetricStatistics are very similar to this (except none of these need the unit conversion of bytes to GB). These widgets are shown in the following figure. You can view the source of the individual widgets to see the exact approach used.

Widget screenshots showing a widget displaying a line chart with Log Event Count for each region individually, and as a total for all regions. Similar widgets are shown for GetMetricData, PutMetricData, and GetMetricStatistics.


Figure 3: CloudWatch widgets showing Log Event Count by region, and Metric API call counts for GetMetricData, PutMetricData, and GetMetricStatistics.

The namespace and metric name used for these widgets are:

  1. Log event count
    • Namespace: AWS/Logs > Log Group Metrics
    • Metric Name: IncomingLogEvents
  2. GetMetricData/PutMetricData/GetMetricStatistics API call
    • Namespace: AWS/Usage > By AWS Resource
    • Service: CloudWatch
    • Resource: GetMetricData/PutMetricData/GetMetricStatistics
    • Metric Name: CallCount
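
As a rough sketch of how the API call count entries might be assembled, the following hypothetical buildApiCallMetrics helper produces one CallCount metric per region plus an all-regions total. The Type=API and Class=None dimensions, and the dimension order, are assumptions here; compare them with the source of your own widget before relying on them.

```javascript
// Hypothetical helper: CallCount metrics for one CloudWatch API across regions.
// Assumption: the AWS/Usage metric also carries Type=API and Class=None
// dimensions. Verify this against the source of an existing widget.
function buildApiCallMetrics(apiName, regions) {
    const perRegion = regions.map((region, i) => [
        "AWS/Usage", "CallCount",
        "Type", "API",
        "Resource", apiName,
        "Service", "CloudWatch",
        "Class", "None",
        { region, label: region, id: `m${i + 1}` }
    ]);
    // No unit conversion is needed, so the raw metrics stay visible and
    // only a total across all regions is added.
    const total = [{ expression: "SUM(METRICS())", label: "All regions", id: "e1" }];
    return [...perRegion, total];
}

console.log(JSON.stringify(buildApiCallMetrics("GetMetricData", ["us-east-1", "us-east-2"]), null, 4));
```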

Metric widgets – top 10 for all regions

Metric widgets are created to show the top 10 across all regions for several sets of data on the dashboard. We’ll look at the widget for “Top 10 Log Groups by volume ( GB ): daily total” shown in the following figure.

Screenshot showing a widget displaying a line chart with the top 10 Log Groups by Ingestion Volume (in GB).

Figure 4: CloudWatch widget showing the top 10 Log Groups by Ingestion Volume (GB).

The approach is similar to the previous example.

We’re still using the IncomingBytes under the AWS/Logs namespace, but this time for each log group.
Adding the metric for each log group would be tedious, and it wouldn’t allow the widget to include any new log groups created without us manually adding it into the widget.
We deal with this by using a metric math expression which uses both the SEARCH and SORT functions.

SORT(SEARCH('{AWS/Logs,LogGroupName} MetricName="IncomingBytes"',"Sum",86400)/1000000000,MAX, DESC, 10)

This math expression does the following:

  1. Searches for the IncomingBytes metric for all of the log groups, and then aggregates each with a statistic of sum and a period of one day (86400s).
  2. Converts the data from bytes to GB.
  3. Sorts the log groups by their maximum value in descending order (DESC), and returns the top 10.

As before, we edit the metric widget source directly so that we can include multiple regions.

 [ { "expression": "SORT(SEARCH('{AWS/Logs,LogGroupName} MetricName=\"IncomingBytes\"',\"Sum\",86400)/1000000000,MAX, DESC, 10)", "id": "e1", "visible": false, "region": "us-east-1" } ],
 [ { "expression": "SORT(SEARCH('{AWS/Logs,LogGroupName} MetricName=\"IncomingBytes\"',\"Sum\",86400)/1000000000,MAX, DESC, 10)", "id": "e2", "visible": false, "region": "us-east-2" } ],
…

Again, we choose not to display these (visible: false), as we’re only interested in the top 10 over all regions. We can do this with another math expression. As before, each region expression has an ID (e1, e2, and so on), and those IDs are used in the combined expression. Note that we can’t use METRICS() as we did before because this only works on raw metrics, not expressions. For more information, see the AWS documentation on the METRICS() function.

SORT([ e1, e2, e3, e4, e5, e6, e7, e8, e9, e10, e11, e12, e13, e14, e15, e16, e17 ],MAX,DESC,10) 

A snippet of the metrics section of the final source JSON can be seen as follows.

"metrics": [
    [ { "expression": "SORT([e1, e2, e3, e4, e5, e6, e7, e8, e9, e10, e11, e12, e13, e14, e15, e16, e17], MAX, DESC, 10)", "label": "[${PROP('Region')}] ${LABEL}", "id": "s1", "region": "us-east-1" } ],
    [ { "expression": "SORT(SEARCH('{AWS/Logs,LogGroupName} MetricName=\"IncomingBytes\"',\"Sum\",86400)/1000000000,MAX, DESC, 10)", "id": "e1", "visible": false, "region": "us-east-1" } ],
    [ { "expression": "SORT(SEARCH('{AWS/Logs,LogGroupName} MetricName=\"IncomingBytes\"',\"Sum\",86400)/1000000000,MAX, DESC, 10)", "id": "e2", "visible": false, "region": "us-east-2" } ],
    
    … continue for each region …
    ],
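
Because the per-region expressions differ only in their ID and region, they lend themselves to being generated. The following is a minimal sketch, assuming Node.js and a hypothetical buildTopLogGroupMetrics helper, that builds one hidden SORT(SEARCH(...)) expression per region and a combined SORT over their IDs. Using the first region in the list as the region of the combined expression is an assumption made for this sketch.

```javascript
// Hypothetical helper: builds the metrics array for a "top N across regions" widget.
function buildTopLogGroupMetrics(regions, topN = 10) {
    const search = `SEARCH('{AWS/Logs,LogGroupName} MetricName="IncomingBytes"',"Sum",86400)`;
    // One hidden expression per region, with IDs e1, e2, ...
    const perRegion = regions.map((region, i) => [{
        expression: `SORT(${search}/1000000000,MAX, DESC, ${topN})`,
        id: `e${i + 1}`,
        visible: false,
        region
    }]);
    // The combined expression refers to the per-region expressions by ID.
    const ids = perRegion.map(([entry]) => entry.id).join(", ");
    const combined = [{
        expression: `SORT([${ids}], MAX, DESC, ${topN})`,
        // Plain string, not a template literal: CloudWatch expands these placeholders.
        label: "[${PROP('Region')}] ${LABEL}",
        id: "s1",
        region: regions[0] // assumption: the first region is the dashboard region
    }];
    return [combined, ...perRegion];
}

console.log(JSON.stringify({ metrics: buildTopLogGroupMetrics(["us-east-1", "us-east-2"]) }, null, 4));
```

JSON.stringify escapes the double quotes inside the SEARCH expression, so the output matches the escaped form used in the widget source.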

You can construct the widget for the log event count for each region in a similar manner.
The pie chart display uses the same source, but in the Options tab you choose the Pie widget type and choose to show the Latest value, which is the value from the most recent period of your chosen time range.

Custom widgets – Cost Explorer API

Three panel display, which shows widgets from a CloudWatch Dashboard. The widgets in this screenshot show CloudWatch costs broken down by AWS Account, by Region, and by both Region and Usage Type.

Figure 5: CloudWatch widgets showing CloudWatch costs by Account, Region, and Usage Type.

The widgets shown in the figure above are displaying CloudWatch costs by account, region, and usage type. These widgets are created using custom widgets for CloudWatch Dashboards. A custom widget is a CloudWatch dashboard widget that calls a Lambda function. The CloudWatch custom widget displays the HTML or JSON returned from the Lambda function.
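
To make the idea concrete, here is a minimal sketch of a custom widget Lambda handler in Node.js. It is not the code used for the dashboard: the name parameter and the escapeHtml helper are hypothetical, added only to show the handler reading a parameter and returning HTML.

```javascript
// Hypothetical helper: escapes text before placing it into the returned HTML.
const escapeHtml = (text) => String(text)
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;");

// Minimal custom widget handler sketch.
const handler = async (event) => {
    // CloudWatch passes dashboard details, including the widget's
    // configured parameters, in event.widgetContext.
    const context = event.widgetContext || {};
    const params = context.params || {};
    const name = params.name || "world"; // "name" is an illustrative parameter
    // The returned string is displayed as HTML in the widget.
    return `<h2>Hello, ${escapeHtml(name)}</h2>`;
};

// Export for Lambda when running as a CommonJS module.
if (typeof module !== "undefined") {
    module.exports = { handler };
}
```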

If you want to explore custom widgets in general, then AWS provides some sample custom widgets. To create a custom widget using one of the samples, follow the AWS documentation on Create a custom widget. In this case, choose the Cost Explorer report. Let AWS CloudFormation create the resources that you need, and then you can modify the Lambda as desired. In addition, if you want to do some hands-on exploration of custom widgets, and create some from scratch, then the One Observability Workshop has a section on creating your own custom widgets.

For the widgets shown in the figure above, we modified the sample costExplorerReport widget. This widget already contained a Lambda function with code to query AWS Cost Explorer, and create a bar chart display of the results using an HTML table.

The Lambda function is specified in the config for the custom widget – you can see this by editing the widget. For this example, the Lambda function for these widgets is called customWidgetCostExplorer-js-<stackname>. You can view the full code for this function in the Lambda console.
We made the following modifications:

  1. We wanted to make the widget more general, so we specified two parameters in the custom widget configuration (see the following figure).
    • displayType: the value can be Linked_Account, Region, or Usage_Type, which lets us use the same code for all three widgets. This defines the GroupBy component for the Cost Explorer data.
    • service: this would let us use the widget to find Cost Explorer information for other services. In this case, the value would be AmazonCloudWatch.
Screenshot of the configuration for a CloudWatch custom widget. The parameters which are to be sent into the Lambda function can be seen, and in this case they are displayType with a value of Region and service with a value of AmazonCloudWatch.

Figure 6: Configuration options for a CloudWatch custom widget, showing the Parameters to be sent to the Lambda function.

We won’t go through the code in detail here, but a few parts are worth highlighting.

Getting the parameters in the Lambda function

The input parameters are captured from the event.widgetContext.params object in the handler of the Lambda function as follows.
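
    // Parameters from the custom widget configuration arrive in widgetContext.params
    const widgetContext = event.widgetContext;

    // Default to grouping by usage type when no displayType is supplied
    let displayType = widgetContext.params.displayType;
    if (typeof displayType === "undefined") {
        displayType = 'Usage_Type';
    }
    // The service is required, so return an error message for the widget to display
    let service = widgetContext.params.service;
    if (typeof service === "undefined") {
        return "<br/><br/><b>Error:</b> Service must be defined in custom widget configuration.";
    }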

You’ll see that we have chosen to set a default displayType if none is defined. But for the service we display an error if it isn’t specified. You should make appropriate design decisions for your situation.
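
One possible variation on this design is to gather the parameter handling into a single function that also checks displayType against the supported values. The following is a sketch only: resolveParams and ALLOWED_DISPLAY_TYPES are hypothetical names, and the extra validation is an addition that the sample function does not perform.

```javascript
// Hypothetical list of the displayType values supported by the widget.
const ALLOWED_DISPLAY_TYPES = ["Linked_Account", "Region", "Usage_Type"];

// Hypothetical helper: resolves widget parameters, applying a default for
// displayType and reporting an error when service is missing.
function resolveParams(event) {
    const params = (event.widgetContext && event.widgetContext.params) || {};
    const displayType = params.displayType === undefined ? "Usage_Type" : params.displayType;
    // Extra check (not in the sample code): reject unsupported grouping values.
    if (!ALLOWED_DISPLAY_TYPES.includes(displayType)) {
        return { error: `Unsupported displayType: ${displayType}` };
    }
    if (params.service === undefined) {
        return { error: "Service must be defined in custom widget configuration." };
    }
    return { displayType, service: params.service };
}

console.log(resolveParams({ widgetContext: { params: { service: "AmazonCloudWatch" } } }));
```

The handler could then return an error message as HTML when the result contains an error property, and continue with displayType and service otherwise.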

Getting the data from Cost Explorer

We define and retrieve the desired data from Cost Explorer using the AWS.CostExplorer class from the AWS SDK for JavaScript. This code is contained in the getCostResults function.

// AWS SDK for JavaScript v2 (assumed to be imported as aws at the top of the file)
const aws = require('aws-sdk');

const getCostResults = async (start, end, displayType, service) => {
    // The Cost Explorer API endpoint is in us-east-1
    const ce = new aws.CostExplorer({ region: 'us-east-1' });
    let NextPageToken = null;
    let costs = [];
    do {
        const params = {
            TimePeriod: { Start: start, End: end },
            Granularity: 'MONTHLY',
            Filter: { Dimensions: { Key: 'SERVICE', Values: [ service ] } },
            GroupBy: [ { Type: 'DIMENSION', Key: displayType.toUpperCase() } ],
            Metrics: [ 'UnblendedCost' ],
            NextPageToken
        };

        const response = await ce.getCostAndUsage(params).promise();

        // Collect each page of results until no further page token is returned
        costs = costs.concat(response.ResultsByTime);
        NextPageToken = response.NextPageToken;
    } while (NextPageToken);

    return costs;
};

You’ll see that we use the displayType and service parameters to specify the data that we want from Cost Explorer. We also use the start and end time set in the dashboard. The Filter section restricts the results to the chosen service, and the GroupBy section uses displayType to control how the data is grouped. This lets us use this widget for all three of our cost widgets (shown in Figure 5). You can find the possible values for the displayType and service by using Cost Explorer in the console. See the AWS SDK documentation for getCostAndUsage for more details on the format and options available.
Note that we set the Granularity to monthly to reduce the amount of data returned. This doesn’t impact the time over which the data is collected (start and end still specify this), but how it’s grouped. In the collate method, we still combine the data from all time periods together – specifying the largest granularity just means we have less data to collate.
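
Cost Explorer expects the time period as date strings in YYYY-MM-DD format, with the End date being exclusive. As a sketch of how the dashboard time range might be converted, the following assumes that the range arrives as epoch milliseconds in event.widgetContext.timeRange. The toCostExplorerPeriod helper is a hypothetical name, not a function from the sample code.

```javascript
// Hypothetical helper: converts a dashboard time range to Cost Explorer dates.
// Assumption: timeRange.start and timeRange.end are epoch milliseconds.
function toCostExplorerPeriod(timeRange) {
    // toISOString() is always UTC, so the date part is the UTC calendar date.
    const toDate = (ms) => new Date(ms).toISOString().split("T")[0];
    return { Start: toDate(timeRange.start), End: toDate(timeRange.end) };
}

const period = toCostExplorerPeriod({
    start: Date.UTC(2023, 0, 1), // 1 January 2023, 00:00 UTC
    end: Date.UTC(2023, 1, 1)    // 1 February 2023, 00:00 UTC
});
console.log(period);
```

If the dashboard time range is shorter than a day, Start and End can resolve to the same date, so a real implementation would need to handle that case before calling Cost Explorer.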

The display output

In this example, the Lambda function returns its output as HTML for the dashboard widget to display. This is done in the tableStart and getCostResultsHtml methods.
Within the collate method, we sort the data (in descending order of cost), and find both the overall total and individual maximum value. These are used within the HTML created to determine the length of the bar created for each row, with the largest cost taking the full width of the table. The total cost is also displayed in the table header.
Your design decisions may be different to ours. We’ll leave you to explore the code further for yourself and make the desired modifications.
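
To illustrate the collation step, here is a minimal sketch that combines the groups from every time period, sorts them by cost, and derives the total and maximum. It is our own illustration rather than the code in the sample function. It assumes the ResultsByTime structure returned by getCostAndUsage, where each group has Keys and a Metrics.UnblendedCost.Amount string, and barWidth is a hypothetical helper.

```javascript
// Sketch of a collate step: sums costs per group across all time periods.
function collate(resultsByTime) {
    const totals = new Map();
    for (const period of resultsByTime) {
        for (const group of period.Groups || []) {
            const key = group.Keys.join(" / ");
            const amount = parseFloat(group.Metrics.UnblendedCost.Amount);
            totals.set(key, (totals.get(key) || 0) + amount);
        }
    }
    // Sort in descending order of cost.
    const rows = [...totals.entries()]
        .map(([key, cost]) => ({ key, cost }))
        .sort((a, b) => b.cost - a.cost);
    const total = rows.reduce((sum, row) => sum + row.cost, 0);
    const max = rows.length > 0 ? rows[0].cost : 0;
    return { rows, total, max };
}

// Hypothetical helper: bar width as a percentage of the largest cost.
const barWidth = (cost, max) => (max > 0 ? Math.round((cost / max) * 100) : 0);

const sample = [
    { Groups: [
        { Keys: ["us-east-1"], Metrics: { UnblendedCost: { Amount: "1.5", Unit: "USD" } } },
        { Keys: ["us-west-2"], Metrics: { UnblendedCost: { Amount: "1.0", Unit: "USD" } } }
    ] },
    { Groups: [
        { Keys: ["us-east-1"], Metrics: { UnblendedCost: { Amount: "2.5", Unit: "USD" } } }
    ] }
];
const collated = collate(sample);
console.log(collated, barWidth(collated.rows[1].cost, collated.max));
```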

Custom widget – log storage API

Screenshot showing a widget displaying the top 10 CloudWatch log groups by total storage size.

Figure 7: CloudWatch widget showing the top 10 log groups for Log Storage (GB).

The figure above shows the top 10 log groups for log storage (GB). This data is presented using a custom widget which calls the DescribeLogGroups API to find the total storage per log group, and then displays the top 10.
For this example, the Lambda function for this widget is called customWidgetLogStorageVolume-js-<stackname>. You can view the full code for this function in the Lambda console.

It’s very similar to the custom widget just described for the Cost Explorer data. There are a few main points worth highlighting:

  • We don’t send any parameters into the Lambda function from the custom widget configuration.
  • We call the describeLogGroups method of the CloudWatchLogs API rather than the CostExplorer API (in the getLogGroups method).
  • The CloudWatchLogs API only lets us get data for the current region, so we create a list of desired regions (getRegions method), and loop through each region to gather the full data.
    • Because log group names only have to be unique within their region, we add a region identifier to the returned data (in the addRegion method).
    • We disregard any data not in the top 10 log groups by storage volume.
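
The region tagging and top 10 selection can be sketched on data that has already been fetched. The following is our own illustration, not the code from the sample function. It assumes that each log group object has the logGroupName and storedBytes fields returned by describeLogGroups, and topLogGroupsByStorage is a hypothetical name.

```javascript
// Adds a region identifier to each log group, because log group names
// are only unique within a region.
const addRegion = (logGroups, region) =>
    logGroups.map((group) => ({ ...group, region }));

// Hypothetical helper: merges the log groups from every region and keeps
// the top N by stored bytes, converted to GB.
function topLogGroupsByStorage(groupsByRegion, topN = 10) {
    return Object.entries(groupsByRegion)
        .flatMap(([region, groups]) => addRegion(groups, region))
        .sort((a, b) => (b.storedBytes || 0) - (a.storedBytes || 0))
        .slice(0, topN) // disregard anything outside the top N
        .map((group) => ({
            name: `[${group.region}] ${group.logGroupName}`,
            storedGB: (group.storedBytes || 0) / 1000000000
        }));
}

const sampleGroups = {
    "us-east-1": [
        { logGroupName: "/app/a", storedBytes: 3000000000 },
        { logGroupName: "/app/b", storedBytes: 1000000000 }
    ],
    "us-west-2": [
        { logGroupName: "/app/a", storedBytes: 2000000000 }
    ]
};
console.log(topLogGroupsByStorage(sampleGroups, 2));
```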

Conclusion

In part 1 of this series, we introduced a CloudWatch dashboard which provides a better understanding of your CloudWatch costs. In this second part, we looked at how these widgets were created so that you can learn how to create something similar, or to modify the widgets to suit your needs.

Although we’ve discussed the specific Metric Math and custom widgets used in the CloudWatch costs dashboard, we encourage you to explore the concepts and how this might be useful to you.

For metric math, you might wish to look into using the search expression for partial matches on dimension values, or performing more in-depth calculations on the data.

For custom widgets, you can leverage the Lambda function to gather, manipulate, and display data from various sources. For example, you could gather data about your Amazon Elastic Compute Cloud (Amazon EC2) instances using the describeInstances method to find the status, or utilize any other AWS API to retrieve information about your resources or service. You could also create your custom widgets so that actions can be taken, perhaps to allow a specific runbook to be executed through AWS Systems Manager, or create/update an Ops Item. If there is an API method for it in the SDK, then this is an option. Check out the SDK for your chosen language to explore further options.


About the authors:

Helen Ashton

Helen Ashton is a Sr. Specialist Solutions Architect at AWS on the Observability team. Helen is passionate about helping customers solve their business problems, and progress through their cloud journey. Outside work, she enjoys music, biking and gardening.

Bobby Hallahan

Bobby Hallahan is a Sr. Specialist Solutions Architect on the AWS Observability team. He is passionate about helping customers find innovative solutions to difficult problems. He works with AWS customers to help them meet their observability goals. During his tenure at AWS, Bobby has supported enterprise customers with their mission-critical workloads.