I want to troubleshoot performance bottlenecks within my Amazon EC2 Linux instances. What advanced tools can I use with EC2Rescue for Linux to do that?

Last updated: 2020-05-08

I want to troubleshoot performance bottlenecks within my Amazon Elastic Compute Cloud (Amazon EC2) Linux instances. What tools can I use with EC2Rescue for Linux to do that?

Short Description

Performance bottlenecks on Amazon EC2 Linux instances can occur in CPU, block I/O, or network performance. To determine where a bottleneck is occurring, you can use the 33 tools available in EC2Rescue for Linux that are built on the bcc framework for eBPF (extended Berkeley Packet Filter). eBPF lets monitoring tools run efficiently and securely in production environments without significant performance overhead.

Resolution

(For experienced Linux system administrators)

Install the bcc package for your operating system.

1.    Connect to your instance using SSH.

2.    Install the bcc package. For download and installation instructions for distributions other than Amazon Linux, refer to the documentation specific to your distribution. For Amazon Linux instances, use the following command:

$ sudo yum install bcc

3.    The bcc tools must be in the PATH variable on your operating system for EC2Rescue for Linux to run them. Use the following commands to add the tools to the PATH variable for the current shell session:

$ sudo -s
# export PATH=$PATH:/usr/share/bcc/tools/

4.    It's a best practice to permanently add the PATH setting to your Linux system. The steps to make this setting permanent vary depending on your specific Linux distribution. For Amazon Linux, use the following commands:

Open ~/.bash_profile using the vi editor:

# vi ~/.bash_profile

Append /usr/share/bcc/tools to the PATH variable:

PATH=$PATH:$HOME/bin:/usr/share/bcc/tools

Save the file and exit the vi editor.

Source the updated profile:

# source ~/.bash_profile

5.    Download and install the EC2Rescue for Linux tool, and then navigate to the installation directory on your instance.
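Steps 3 and 4 above can be sketched as a small preflight script. This is only an illustration, not part of EC2Rescue for Linux: the function names path_append and missing_tools are hypothetical, and the list of tool names is an assumption derived from the module names in this article (for example, bccsoftirqs runs softirqs). The script adds the bcc tools directory to PATH only if it isn't already there, and then reports any tool that still can't be found.

```shell
#!/bin/sh
# Directory where the bcc tools are expected (taken from step 3).
BCC_TOOLS_DIR=/usr/share/bcc/tools

# path_append DIR: add DIR to PATH only if it is not already listed.
path_append() {
    case ":$PATH:" in
        *":$1:"*) ;;                # already present, nothing to do
        *) PATH="$PATH:$1" ;;
    esac
}

# missing_tools NAME...: print each name that cannot be resolved through PATH.
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
    done
}

path_append "$BCC_TOOLS_DIR"
export PATH

# Assumption: tool names are inferred from the module names in this article.
missing=$(missing_tools softirqs runqlat biolatency ext4slower xfsslower \
    fileslower tcpconnlat tcptop tcplife)
if [ -n "$missing" ]; then
    echo "Not found in PATH:" $missing
else
    echo "All bcc tools used in this article are in PATH."
fi
```

Because path_append checks for an existing entry first, sourcing the script repeatedly doesn't keep lengthening PATH.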

The following are commonly used bcc-based modules in EC2Rescue for Linux.

CPU performance tools

bccsoftirqs.yaml - Runs the softirqs tool, which traces soft interrupts (IRQs) and stores timing statistics in-kernel for efficiency. You can set an interval with the --period argument and a count with the --times argument. The tool automatically prints a timestamp for each execution. For more information, see EC2Rescue for Linux - bccsoftirqs.yaml on the GitHub website.

bccrunqlat.yaml - Shows how long tasks spend waiting for their turn to run on-CPU. Results are printed as a histogram. For more information, see EC2Rescue for Linux - bccrunqlat.yaml on the GitHub website.

# ./ec2rl run --only-modules=bccsoftirqs,bccrunqlat --period=5 --times=5

Block I/O performance tools

bccbiolatency.yaml - Traces block device I/O and records the distribution of I/O latency (time) for each disk device attached to your EC2 instance, such as instance store volumes and Amazon Elastic Block Store (Amazon EBS) volumes. Results are printed as a histogram. The module runs for the specified period and collects output a specified number of times. In the example below, the period and times variables are set to 5. For more information, see EC2Rescue for Linux - bccbiolatency.yaml on the GitHub website.

bccext4slower.yaml - Collects output using the ext4slower tool. ext4slower traces any ext4 reads, writes, opens, and fsyncs that are slower than a threshold of 10 ms by default. The module runs for the specified period and collects output a specified number of times. In the example below, the period and times variables are set to 5. For more information, see EC2Rescue for Linux - bccext4slower.yaml on the GitHub website.

For XFS file systems, you can use the bccxfsslower.yaml module in the same way as bccext4slower.yaml. For more information, see EC2Rescue for Linux - bccxfsslower.yaml on the GitHub website.

bccfileslower.yaml - Collects output using the fileslower tool, which traces file-based synchronous reads and writes that are slower than a default threshold of 10 ms. The module runs for the specified period and collects output a specified number of times. In the example below, the period and times variables are set to 5. For more information, see EC2Rescue for Linux - bccfileslower.yaml on the GitHub website.

# ./ec2rl run --only-modules=bccbiolatency,bccext4slower,bccfileslower --period=5 --times=5
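Because bccext4slower and bccxfsslower each target one file system type, it can help to pick the module based on the file system that you want to trace. The following is a minimal sketch of that choice, not an EC2Rescue for Linux feature. The function names slower_module and fstype_of are hypothetical, and falling back to bccfileslower for other file system types is an assumption. The script only prints the ec2rl command so that you can review it before running it.

```shell
#!/bin/sh
# slower_module FSTYPE: map a file system type to a module name.
slower_module() {
    case "$1" in
        ext4) echo bccext4slower ;;
        xfs)  echo bccxfsslower ;;
        *)    echo bccfileslower ;;   # assumption: generic fallback
    esac
}

# fstype_of PATH: file system type of the mount that holds PATH.
# Uses "df -PT": -T adds the Type column, -P keeps each entry on one line.
fstype_of() {
    df -PT "$1" 2>/dev/null | awk 'NR == 2 { print $2 }'
}

# Trace the file system that holds the given path (default: /).
target=${1:-/}
module=$(slower_module "$(fstype_of "$target")")

# Print the command instead of running it.
echo "./ec2rl run --only-modules=bccbiolatency,$module --period=5 --times=5"
```

To use the sketch for a data volume, pass its mount point as the first argument.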

Network performance tools

bcctcpconnlat.yaml - Traces the kernel function that performs active TCP connections (for example, through a connect() syscall) for the specified period. The results display the latency (time) for each connection, which indicates how long the connection took to establish. Latency is measured locally, as the time from when the SYN is sent to when the response packet is received. For more information, see EC2Rescue for Linux - bcctcpconnlat.yaml on the GitHub website.

bcctcptop.yaml - Displays TCP connection throughput per host and port for the specified period and times without clearing the screen. For more information, see EC2Rescue for Linux - bcctcptop.yaml on the GitHub website.

bcctcplife.yaml - Summarizes TCP sessions that open and close while tracing. For more information, see EC2Rescue for Linux - bcctcplife.yaml on the GitHub website.

# ./ec2rl run --only-modules=bcctcpconnlat,bcctcptop,bcctcplife --period=5 --times=5

Output example

Each run of one or more modules stores its results in a separate subdirectory of /var/tmp/ec2rl on your instance.
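If you run the tool several times, you might want to locate the most recent results quickly. The following sketch assumes that run directory names begin with a sortable timestamp, as in the example path in this article (2020-04-20T21_50_01.177374), and that module logs are in the mod_out/run subdirectory. The function name latest_run_dir is hypothetical.

```shell
#!/bin/sh
# latest_run_dir [BASE]: print the newest run directory under BASE.
# Assumes names start with a sortable timestamp, so lexical order
# matches chronological order.
latest_run_dir() {
    base=${1:-/var/tmp/ec2rl}
    for dir in "$base"/*/; do
        if [ -d "$dir" ]; then
            printf '%s\n' "${dir%/}"
        fi
    done | LC_ALL=C sort | tail -n 1
}

run_dir=$(latest_run_dir)
if [ -n "$run_dir" ]; then
    echo "Latest run: $run_dir"
    # List the module logs, if the expected subdirectory exists.
    ls "$run_dir/mod_out/run" 2>/dev/null ||
        echo "No mod_out/run directory in $run_dir"
else
    echo "No EC2Rescue for Linux runs found under /var/tmp/ec2rl"
fi
```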

The following example is the output from the bcctcptop module with the period parameter set to 5 and the times parameter set to 2:

# ./ec2rl run --only-modules=bcctcptop --period=5 --times=2
# cat /var/tmp/ec2rl/2020-04-20T21_50_01.177374/mod_out/run/bcctcptop.log 
I will collect tcptop output from this alami box 2 times.
Tracing... Output every 5 secs. Hit Ctrl-C to end
21:50:17 loadavg: 0.74 0.33 0.17 5/244 4285
PID    COMM         LADDR                 RADDR                  RX_KB  TX_KB
3989   sshd         172.31.22.238:22      72.21.196.67:26601         0      9
21:50:22 loadavg: 0.84 0.36 0.18 4/244 4285
PID    COMM         LADDR                 RADDR                  RX_KB  TX_KB
3989   sshd         172.31.22.238:22      72.21.196.67:26601         0     11
2731   amazon-ssm-a 172.31.22.238:54348   52.94.225.236:443          5      4
2938   amazon-ssm-a 172.31.22.238:58878   52.119.197.249:443         0      0
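When a log covers many intervals, totals for each process can be easier to read than the raw rows. The following sketch sums the RX_KB and TX_KB columns by command name. It assumes the six-column row layout shown in the preceding output, and the function name tcptop_totals is hypothetical.

```shell
#!/bin/sh
# tcptop_totals FILE: sum RX_KB and TX_KB per command name across all intervals.
# Data rows are six-field lines whose first field is a numeric PID, so the
# banner, loadavg, and column header lines are skipped.
tcptop_totals() {
    awk 'NF == 6 && $1 ~ /^[0-9]+$/ { rx[$2] += $5; tx[$2] += $6 }
         END {
             for (c in rx)
                 printf "%s rx_kb=%d tx_kb=%d\n", c, rx[c], tx[c]
         }' "$1" | LC_ALL=C sort
}

# Usage: pass the path to a bcctcptop.log file as the first argument.
if [ -n "${1:-}" ]; then
    tcptop_totals "$1"
fi
```

For the preceding example output, the sketch prints "amazon-ssm-a rx_kb=5 tx_kb=4" and "sshd rx_kb=0 tx_kb=20".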

You can upload results to AWS Support using the following command:

# ./ec2rl upload --upload-directory=/var/tmp/ec2rl/2020-04-20T21_50_01.177374 --support-url="URLProvidedByAWSSupport"

Note: The quotation marks in the preceding command are required. If you ran the tool with sudo, upload the results using sudo as well. For details on using an Amazon Simple Storage Service (Amazon S3) presigned URL to upload the output, run ./ec2rl help upload.

