I need to analyze my Application Load Balancer access logs. How do I do that with Amazon Athena?

By default, Elastic Load Balancing doesn't enable access logging. When you enable access logging, you specify an Amazon Simple Storage Service (Amazon S3) bucket. All Application Load Balancer and Classic Load Balancer access logs are stored in that S3 bucket. Then, when you want to troubleshoot or analyze the performance of your load balancer, you can use Athena to analyze the access logs in S3.

Note: Although you can use Athena to analyze access logs for both Application Load Balancers and Classic Load Balancers, only Application Load Balancers are covered here.  

To analyze access logs in Athena, create a database and table:

1.    Open the Athena console.

2.    In the Query Editor, run a command similar to the following to create a database.
Note: It's a best practice to create the database in the same AWS Region as your S3 bucket.

create database alb_db

3.    In the database that you created in step 2, create a table for the ALB logs.

4.    Under Tables in the left pane, choose Preview table from the menu button that's next to the table name. In the Results window, you should see data from the ALB access logs. This means that you successfully created the Athena table and can now query the ALB access logs.

5.    Use the Query Editor to run SQL statements on the table that you created in step 3. You can save queries, view previous queries, or download query results in CSV format.

Example queries

In the examples below, alb_log is the table name. Modify the table name, column values, and other variables to fit your query.

Action Query

View the first 100 access log entries in chronological order.

Use case: analysis and troubleshooting

SELECT *
FROM alb_log
ORDER by time ASC
LIMIT 100;

List all client IP addresses that accessed the ALB, and how many times they accessed the ALB.

Use case: analysis and troubleshooting

SELECT distinct client_ip, count() as count from alb_log
GROUP by client_ip
ORDER by count() DESC;

List the average amount of data (in kilobytes) that is passing through the ALB in request/response pairs.

Use case: analysis and troubleshooting

SELECT (avg(sent_bytes)/1000.0 + avg(received_bytes)/1000.0)
as prewarm_kilobytes from alb_log;

List all targets that the ALB is routing traffic to and how many times the ALB has routed requests to each target, by percentage distribution.

Use case: identify potential target traffic imbalances

SELECT target_ip, (Count(target_ip)* 100.0 / (Select Count(*) From alb_log))
as backend_traffic_percentage
FROM alb_log
GROUP by target_ip
ORDER By count() DESC;

List the times that a client sent a request to the ALB and then terminated the connection to the ALB before the ALB idle timeout elapsed (HTTP 460 error).

Use case: troubleshoot HTTP 460 errors

SELECT * from alb_log where elb_status_code = '460'

List the times that a client request wasn't routed because the listener rule forwarded the request to an empty target group (HTTP 503 error).

Use case: troubleshoot HTTP 503 errors

SELECT * from alb_log where elb_status_code = '503'

List clients in descending order, by the number of times that each client visited a specified URL.

Use case: analyze traffic patterns

SELECT client_ip, elb, request_url, count(*) as count from alb_log
GROUP by client_ip, elb, request_url
ORDER by count DESC;

List the 10 URLs that Firefox users accessed most frequently, in descending order.

Use case: analyze traffic distribution and patterns

SELECT request_url, user_agent, count(*) as count
FROM alb_log
WHERE user_agent LIKE '%Firefox%'
GROUP by request_url, user_agent
ORDER by count(*) DESC
LIMIT 10;

List clients in descending order, by the amount of data (in megabytes) that each client sent in their requests to the ALB.

Use case: analyze traffic distribution and patterns

SELECT client_ip, sum(received_bytes/1000000.0) as client_datareceived_megabytes
FROM alb_log
GROUP by client_ip
ORDER by client_datareceived_megabytes DESC;

List each time in a specified date range when the target processing time was more than 5 seconds.

Use case: troubleshoot latency in a specified time frame

SELECT * from alb_log
WHERE (parse_datetime(time,'yyyy-MM-dd''T''HH:mm:ss.SSSSSS''Z')
     BETWEEN parse_datetime('2018-08-08-00:00:00','yyyy-MM-dd-HH:mm:ss')
     AND parse_datetime('2018-08-08-02:00:00','yyyy-MM-dd-HH:mm:ss'))
AND (target_processing_time >= 5.0)


Did this page help you? Yes | No

Back to the AWS Support Knowledge Center

Need help? Visit the AWS Support Center

Published: 2018-12-20