Monitoring
Overview
Monitoring application delivery helps you detect unusual events and respond to them appropriately. Amazon CloudWatch monitors Amazon Web Services (AWS) resources, including the AWS Edge Services used for application delivery. For example, server-side metrics emitted by AWS Edge Services help detect unexpected increases in traffic volume, significant drops in cache hit ratio, sharp rises in 5xx errors, or DDoS attacks. In addition to server-side metrics, CloudWatch collects and tracks client-side metrics using CloudWatch RUM. You can build custom dashboards from CloudWatch metrics and configure alarms on them.
Server-side metrics
Native CloudWatch metrics emitted by AWS Edge Services
Consider the following near-real-time CloudWatch metrics emitted by AWS Edge Services when delivering and protecting your application:
- CloudFront emits the following metrics: Requests, Bytes downloaded, Bytes uploaded, 4xx error rate, 5xx error rate, and Total error rate. Note that these metrics are available in the us-east-1 Region because CloudFront is a global service. For an extra cost, you can enable additional metrics such as cache hit rate, origin latency, and error rates for specific status codes.
- CloudFront Functions emits the following metrics in the us-east-1 Region: invocations, validation errors, execution errors, compute utilization, and throttles.
- Lambda@Edge is based on AWS Lambda and, as such, emits a subset of Lambda metrics, such as invocations, errors, duration, concurrent executions, and throttles. In contrast with CloudFront Functions, Lambda@Edge emits its metrics in each Region where CloudFront executes it. The CloudFront console offers a consolidated view of these metrics across all Regions.
- AWS WAF emits metrics such as allowed requests, blocked requests, counted requests, requests verified with CAPTCHA, and requests verified by a challenge. Each metric can be broken down by dimensions such as WebACL, rule, country, or device. Note that AWS WAF metrics are available in the us-east-1 Region when the WebACL is applied to CloudFront.
- Shield Advanced emits metrics for detected DDoS attacks, such as attack bits per second, packets per second, and requests per second.
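As a minimal sketch of reading the CloudFront metrics listed above, the Python snippet below builds a GetMetricData request for one distribution. It assumes the AWS/CloudFront namespace with the DistributionId and Region=Global dimensions, queried through a CloudWatch client in us-east-1. The distribution ID and the helper names (build_cloudfront_queries, fetch_last_hour) are hypothetical and only serve the illustration.

```python
from datetime import datetime, timedelta, timezone

# Default CloudFront metrics and a statistic that suits each one:
# counts are summed, error rates (percentages) are averaged.
CLOUDFRONT_METRICS = {
    "Requests": "Sum",
    "BytesDownloaded": "Sum",
    "BytesUploaded": "Sum",
    "4xxErrorRate": "Average",
    "5xxErrorRate": "Average",
    "TotalErrorRate": "Average",
}


def build_cloudfront_queries(distribution_id, period=60):
    """Return MetricDataQueries for one distribution (ID is a placeholder)."""
    queries = []
    for index, (name, stat) in enumerate(CLOUDFRONT_METRICS.items()):
        queries.append({
            "Id": f"m{index}",  # query IDs must start with a lowercase letter
            "Label": name,
            "MetricStat": {
                "Metric": {
                    "Namespace": "AWS/CloudFront",
                    "MetricName": name,
                    "Dimensions": [
                        {"Name": "DistributionId", "Value": distribution_id},
                        {"Name": "Region", "Value": "Global"},
                    ],
                },
                "Period": period,
                "Stat": stat,
            },
        })
    return queries


def fetch_last_hour(distribution_id):
    """Query CloudWatch in us-east-1; needs boto3 and AWS credentials."""
    import boto3  # imported lazily so the builder above has no dependencies

    cloudwatch = boto3.client("cloudwatch", region_name="us-east-1")
    end = datetime.now(timezone.utc)
    return cloudwatch.get_metric_data(
        MetricDataQueries=build_cloudfront_queries(distribution_id),
        StartTime=end - timedelta(hours=1),
        EndTime=end,
    )
```

Keeping the request builder separate from the network call makes it possible to inspect or unit test the queries without AWS access.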
You can create a CloudWatch dashboard based on the above metrics emitted by AWS Edge Services, even when the metrics span multiple Regions and accounts. The example below is a security dashboard based on metrics emitted by AWS WAF rules.
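A dashboard of this kind can also be defined programmatically. The sketch below builds a dashboard body with one widget that plots blocked requests per AWS WAF rule. It assumes a WebACL attached to CloudFront whose metrics are published in the AWS/WAFV2 namespace with WebACL and Rule dimensions; verify the exact dimension set for your WebACL in the CloudWatch console. The WebACL name, rule names, and function names are hypothetical.

```python
import json


def waf_rule_metric(web_acl, rule, metric_name="BlockedRequests", account_id=None):
    """One dashboard metric entry: namespace, metric name, then dimension pairs."""
    entry = ["AWS/WAFV2", metric_name, "WebACL", web_acl, "Rule", rule]
    options = {"label": f"{rule} {metric_name}"}
    if account_id:
        # Source account of the metric, for cross-account dashboards.
        options["accountId"] = account_id
    entry.append(options)
    return entry


def build_security_dashboard(web_acl, rules, account_id=None):
    """Return a dashboard body (JSON string) with one line per WAF rule."""
    widget = {
        "type": "metric",
        "x": 0, "y": 0, "width": 24, "height": 6,
        "properties": {
            "title": f"Blocked requests by rule ({web_acl})",
            "metrics": [
                waf_rule_metric(web_acl, rule, account_id=account_id)
                for rule in rules
            ],
            "region": "us-east-1",  # WAF metrics for CloudFront live here
            "stat": "Sum",
            "period": 300,
            "view": "timeSeries",
        },
    }
    return json.dumps({"widgets": [widget]})


def publish_dashboard(name, body):
    """Create or update the dashboard; needs boto3 and AWS credentials."""
    import boto3  # imported lazily so the builders above have no dependencies

    cloudwatch = boto3.client("cloudwatch", region_name="us-east-1")
    return cloudwatch.put_dashboard(DashboardName=name, DashboardBody=body)
```

Each metric entry accepts a trailing options object, so a single widget can mix metrics from several Regions (a region key) or accounts (an accountId key, assuming cross-account observability is already configured).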
Advanced metrics
Advanced metrics for your application can be created in multiple ways. The first approach combines native CloudWatch metrics into more sophisticated ones; the second derives custom metrics from service logs.
For the first approach, use CloudWatch metric math. For example, you can calculate the total requests per second delivered by CloudFront by dividing CloudFront's Requests metric by the period of measurement (m1/PERIOD(m1)). Another example is creating a composite metric that reflects the health of your application through a logical combination of other metrics (e.g. healthy if CloudFront's 5xx error rate < 0.5% AND server latency < 1 s). This composite metric can then be used with Shield Advanced's health checks.
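The two metric math examples above could be expressed as the following GetMetricData queries. This is a sketch under assumptions: CloudFront's OriginLatency additional metric (reported in milliseconds) stands in for "server latency", the distribution ID is a placeholder, and all function names are hypothetical. The small requests_per_second helper only mirrors the arithmetic of m1/PERIOD(m1) for a single datapoint.

```python
def cloudfront_stat(query_id, metric_name, stat, distribution_id, period):
    """Input metric for a math expression; hidden from the results."""
    return {
        "Id": query_id,
        "ReturnData": False,  # only the expressions below are returned
        "MetricStat": {
            "Metric": {
                "Namespace": "AWS/CloudFront",
                "MetricName": metric_name,
                "Dimensions": [
                    {"Name": "DistributionId", "Value": distribution_id},
                    {"Name": "Region", "Value": "Global"},
                ],
            },
            "Period": period,
            "Stat": stat,
        },
    }


def build_metric_math_queries(distribution_id, period=60):
    """Queries for requests per second and a 1/0 health signal."""
    return [
        cloudfront_stat("m1", "Requests", "Sum", distribution_id, period),
        cloudfront_stat("m2", "5xxErrorRate", "Average", distribution_id, period),
        # Assumption: OriginLatency (ms) requires additional metrics enabled.
        cloudfront_stat("m3", "OriginLatency", "Average", distribution_id, period),
        {
            "Id": "rps",
            "Expression": "m1/PERIOD(m1)",
            "Label": "Requests per second",
            "ReturnData": True,
        },
        {
            "Id": "healthy",
            # 1 when 5xx rate is below 0.5% and latency is below 1000 ms.
            "Expression": "IF(m2 < 0.5 AND m3 < 1000, 1, 0)",
            "Label": "Healthy (1) or unhealthy (0)",
            "ReturnData": True,
        },
    ]


def requests_per_second(request_count, period_seconds):
    """Local equivalent of m1/PERIOD(m1) for one datapoint."""
    return request_count / period_seconds
```

For instance, 6,000 requests summed over a 60-second period correspond to 100 requests per second.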
In the second approach, use logs generated by AWS Edge services to emit custom metrics. Some implementations include:
- Configuring metric filters on logs sent to CloudWatch Logs. For example, you can configure your CloudFront Function to log the occurrences of requests with a certain query string, and use a metric filter to count these occurrences.
- Processing CloudFront and WAF logs sent to Kinesis using Lambda to emit custom metrics. Consider this example implementation.
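The metric filter option can be sketched as follows. The snippet builds the parameters for the CloudWatch Logs PutMetricFilter API, assuming the CloudFront Function writes a fixed marker string with console.log whenever the query string of interest is present. The marker text, metric name, namespace, and function names are hypothetical; the log group name follows the /aws/cloudfront/function/ prefix that CloudFront Functions use in us-east-1.

```python
def build_metric_filter(function_name, marker="promo_query_seen"):
    """Parameters for put_metric_filter that count log lines with `marker`."""
    return {
        "logGroupName": f"/aws/cloudfront/function/{function_name}",
        "filterName": f"{function_name}-{marker}",
        # A quoted term matches log events that contain this exact text.
        "filterPattern": f'"{marker}"',
        "metricTransformations": [
            {
                "metricName": "PromoQueryRequests",   # hypothetical name
                "metricNamespace": "Custom/EdgeApp",  # hypothetical namespace
                "metricValue": "1",    # add 1 for every matching log event
                "defaultValue": 0.0,   # report 0 when nothing matches
            }
        ],
    }


def create_metric_filter(function_name):
    """Create the filter in us-east-1; needs boto3 and AWS credentials."""
    import boto3  # imported lazily so the builder above has no dependencies

    logs = boto3.client("logs", region_name="us-east-1")
    return logs.put_metric_filter(**build_metric_filter(function_name))
```

The resulting custom metric can then be graphed or alarmed on like any native CloudWatch metric.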
Alerting
You can create alarms to get notified when CloudWatch metrics indicate an unusual event. Follow the steps in this blog to set up an alarm based on a threshold for 5xx error rate on CloudFront. In addition to alarming based on thresholds, CloudWatch anomaly detection lets you baseline your metrics and create alarms based on abnormal changes compared to that baseline.
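Both alarm styles can be sketched with the PutMetricAlarm API. The snippet below builds parameters for a static-threshold alarm on CloudFront's 5xx error rate and for an anomaly detection alarm on the Requests metric. The threshold, evaluation periods, band width, SNS topic ARN, and function names are illustrative assumptions, not recommended values.

```python
def cloudfront_dimensions(distribution_id):
    """Dimensions shared by all CloudFront distribution metrics."""
    return [
        {"Name": "DistributionId", "Value": distribution_id},
        {"Name": "Region", "Value": "Global"},
    ]


def build_threshold_alarm(distribution_id, topic_arn, threshold_percent=1.0):
    """Alarm when the average 5xx error rate stays above a fixed percentage."""
    return {
        "AlarmName": f"cloudfront-5xx-{distribution_id}",
        "Namespace": "AWS/CloudFront",
        "MetricName": "5xxErrorRate",
        "Dimensions": cloudfront_dimensions(distribution_id),
        "Statistic": "Average",
        "Period": 60,
        "EvaluationPeriods": 5,
        "Threshold": threshold_percent,
        "ComparisonOperator": "GreaterThanThreshold",
        "TreatMissingData": "notBreaching",
        "AlarmActions": [topic_arn],
    }


def build_anomaly_alarm(distribution_id, topic_arn, band_width=2):
    """Alarm when Requests leaves the band learned by anomaly detection."""
    return {
        "AlarmName": f"cloudfront-requests-anomaly-{distribution_id}",
        "Metrics": [
            {
                "Id": "m1",
                "ReturnData": True,
                "MetricStat": {
                    "Metric": {
                        "Namespace": "AWS/CloudFront",
                        "MetricName": "Requests",
                        "Dimensions": cloudfront_dimensions(distribution_id),
                    },
                    "Period": 300,
                    "Stat": "Sum",
                },
            },
            {
                "Id": "band",
                "Expression": f"ANOMALY_DETECTION_BAND(m1, {band_width})",
                "Label": "Expected range",
                "ReturnData": True,
            },
        ],
        "ThresholdMetricId": "band",  # compare m1 against the band, not a number
        "ComparisonOperator": "LessThanLowerOrGreaterThanUpperThreshold",
        "EvaluationPeriods": 3,
        "TreatMissingData": "notBreaching",
        "AlarmActions": [topic_arn],
    }


def create_alarm(params):
    """Create the alarm in us-east-1; needs boto3 and AWS credentials."""
    import boto3  # imported lazily so the builders above have no dependencies

    cloudwatch = boto3.client("cloudwatch", region_name="us-east-1")
    return cloudwatch.put_metric_alarm(**params)
```

Because CloudFront metrics are published in us-east-1, the alarm and the SNS topic it notifies would both be created in that Region.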
Security findings in Security Hub
AWS Firewall Manager creates findings in AWS Security Hub for resources that are out of compliance and for attacks detected by Shield Advanced.
CloudWatch Internet Monitor
CloudWatch Internet Monitor provides visibility into the performance of your internet-facing applications, using the connectivity data that AWS captures from its global networking footprint. Internet Monitor provides continuous observability of internet measurements, such as availability and performance, tailored to your workload footprint on AWS. You can use Internet Monitor to get insights into average internet performance metrics over time, and about issues (events) by location and internet service provider (ISP). Using Internet Monitor, you can easily identify which events are impacting end user experience for applications using services like CloudFront. Refer to this blog post on how to monitor internet traffic to CloudFront edge in one click with Amazon CloudWatch Internet Monitor.
Client-side monitoring
In addition to server-side metrics, it's recommended to collect client-side metrics using CloudWatch RUM. RUM gives you the most accurate data about how your web application behaves from your users' perspective. To use CloudWatch RUM, you need to add a JavaScript tag to your web pages. The JavaScript collects data from browser APIs, such as page load times, Core Web Vitals, or application errors, and then sends it to CloudWatch RUM for dashboarding. In addition, CloudWatch RUM emits CloudWatch metrics such as WebVitalsCumulativeLayoutShift, WebVitalsFirstInputDelay, WebVitalsLargestContentfulPaint, JsErrorCount, and HttpStatusCodeCount.
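The RUM metrics named above can be read like any other CloudWatch metric. The sketch below builds GetMetricData queries for two Core Web Vitals and the JavaScript error count, assuming the metrics are published in the AWS/RUM namespace with an application_name dimension set to the app monitor's name. The app monitor name, the choice of the 75th percentile, and the function names are illustrative assumptions.

```python
def build_rum_queries(app_monitor_name, period=300):
    """MetricDataQueries for a CloudWatch RUM app monitor (name is a placeholder)."""
    dimensions = [{"Name": "application_name", "Value": app_monitor_name}]

    def query(query_id, metric_name, stat):
        return {
            "Id": query_id,
            "Label": f"{metric_name} ({stat})",
            "MetricStat": {
                "Metric": {
                    "Namespace": "AWS/RUM",
                    "MetricName": metric_name,
                    "Dimensions": dimensions,
                },
                "Period": period,
                "Stat": stat,
            },
        }

    return [
        # Percentiles describe what most users experience better than averages.
        query("lcp_p75", "WebVitalsLargestContentfulPaint", "p75"),
        query("cls_p75", "WebVitalsCumulativeLayoutShift", "p75"),
        query("js_errors", "JsErrorCount", "Sum"),
    ]


def fetch_rum_metrics(app_monitor_name, region, start, end):
    """Query the Region hosting the app monitor; needs boto3 and credentials."""
    import boto3  # imported lazily so the builder above has no dependencies

    cloudwatch = boto3.client("cloudwatch", region_name=region)
    return cloudwatch.get_metric_data(
        MetricDataQueries=build_rum_queries(app_monitor_name),
        StartTime=start,
        EndTime=end,
    )
```

Unlike CloudFront metrics, these are queried in the Region where the app monitor was created, which is why the Region is a parameter here.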