AWS DevOps Blog

Automating code reviews and application profiling with Amazon CodeGuru

Amazon CodeGuru is a machine learning-based service released during re:Invent 2019 for automated code reviews and application performance recommendations. CodeGuru equips the development teams with the tools to maintain a high bar for coding standards in their software development process.

CodeGuru Reviewer helps developers avoid introducing issues that are difficult to detect, troubleshoot, reproduce, and root-cause. It also enables them to improve application performance. This not only improves the reliability of the software, but also cuts down the time spent chasing difficult issues like race conditions, slow resource leaks, thread safety issues, use of un-sanitized inputs, inappropriate handling of sensitive data, and application performance impact, to name a few.

CodeGuru is powered by machine learning, best practices, and hard-learned lessons across millions of code reviews and thousands of applications profiled on open source projects and internally at Amazon.

The service leverages machine-learning abilities to provide following two functionalities:

a) Reviewer: provides automated code reviews for static code analysis

b) Profiler: provides visibility into and recommendations about application performance during runtime

This blog post provides a short workshop to get a feel for both the above functionalities.

 

Solution overview

The following diagram illustrates a typical developer workflow in which the CodeGuru service is used in the code-review stage and the application performance-monitoring stage. The code reviewer is used for static code analysis backed with trained machine-learning models, and the profiler is used to monitor application performance when the code artifact is deployed and executed on the target compute.

Development Workflow

 

The following diagram depicts additional details to show the big picture in the overall schema of the CodeGuru workflow:

Big Picture Development Workflow
This blog workshop automates the deployment of a sample application from a GitHub link via an AWS CloudFormation template, including the dependencies needed. It also demonstrates the Reviewer functionality.

 

Pre-requisites

Follow these steps to get set up:

1. Set up your AWS Cloud9 environment and access the bash terminal, preferably in the us-east-1 region.

2. Ensure you have an individual GitHub account.

3. Configure an Amazon EC2 key-pair (preferably in the us-east-1 region) and make the .pem file available from the terminal being used.

 

CodeGuru Reviewer

This section demonstrates how to configure CodeGuru Reviewer functionality and associate it with the GitHub repository. Execute the following configuration steps:

Step 1: Fork the GitHub repository
First, log in to your GitHub account and navigate to this sample code. Choose Fork and wait for it to create a fork in your account, as shown in the following screenshot.

Figure shows how to fork a repository from GitHub

 

Step 2: Associate the GitHub repository
Log in to the CodeGuru dashboard and follow these steps:

1. Choose Reviewer from the left panel and choose Associate repository.

2. Choose GitHub and then choose Connect to GitHub.

3. Once authenticated and connection made, you can select the repository aws-codeguru-profiler-sample-application from the Repository location drop-down list and choose Associate, as shown in the following screenshot.

Associate repository with CodeGuru service

This associates the CodeGuru Reviewer with the specified repository and continues to listen for any pull-request events.

Associated repository with CodeGuru service

Step 3: Prepare your code
From your AWS Cloud9 terminal, clone the repository, create a new branch, using the following example commands:

git clone https://github.com/<your-userid>/aws-codeguru-profiler-sample-application.git
cd aws-codeguru-profiler-sample-application
git branch dev
git checkout dev
cd src/main/java/com/company/sample/application/

Open the file: CreateOrderThread.java and goto the line 63. Below line 63 which adds an order entry, insert the if statement under the comment to introduce an order entry check. Please indent the lines with spaces so they are well aligned as shown below.

SalesSystem.orders.put(orderDate, order);
//Check if the Order entered and present
if (SalesSystem.orders.containsKey(orderDate)) {
        System.out.println("New order verified to be present in hashmap: " + SalesSystem.orders.get(orderDate)); 
}
id++;
 
       

Once the above changes are introduced in the file, save and commit it to git and push it to the Repository.

git add .
git commit -s -m "Introducing new code that is potentially thread unsafe and inefficient"
cd ../../../../../../../
ls src/main/java/com/company/sample/application/

Now, upload the new branch to the GitHub repository using the following commands. Enter your credentials when asked to authenticate against your GitHub account:

git status
git push --set-upstream origin dev
ls

Step 4: Create a Pull request on GitHub:
In your GitHub account, you should see a new branch: dev.

1. Go to your GitHub account and choose the Pull requests tab.

2. Select New pull request.

3. Under Comparing Changes, select <userid>/aws-codeguru-profiler-sample-application as the source branch.

4. Select the options from the two drop-down lists that selects a merge proposal from the dev branch to the master branch, as shown in the following screenshot.

5. Review the code diffs shown. It should say that the diffs are mergeable (able to merge). Choose Create Pull request to complete the process.

Creating Pull request

This sends a Pull request notification to the CodeGuru service and is reflected on the CodeGuru dashboard, as shown in the following screenshot.

CodeGuru Dashboard

After a short time, a set of recommendations appears on the same GitHub page on which the Pull request was created.

The demo profiler configuration and recommendations shown on the dashboard are provided by default as a sample application profile. See the profiler section of this post for further discussion.

The following screenshot shows a recommendation generated about potential thread concurrency susceptibility:

 

CodeGuru Recommendations on GitHub

 

Another example below to show how the developer can provide feedback about recommendations using emojis:

How to provide feedback for CodeGuru recommendations

As you can see from the recommendations, not only are the code issues detected, but a detailed recommendation is also provided on how to fix the issues, along with a link to examples, and documentation, wherever applicable. For each of the recommendations, a developer can give feedback about whether the recommendation was useful or not with a simple emoji selection under Pick your reaction.

Please note that the CodeGuru service is used to identify difficult-to-find functional defects and not syntactical errors. Syntax errors should be flagged by the IDE and addressed at an early stage of development. CodeGuru is introduced at a later stage in a developer workflow, when the code is already developed, unit-tested, and ready for code-review.

 

CodeGuru Profiler

CodeGuru Profiler functionality focuses on searching for application performance optimizations, identifying your most “expensive” lines of code that take unnecessarily long times or higher-than-expected CPU cycles, for which there is a better/faster/cheaper alternative. It generates recommendations with actions you can take in order to reduce your CPU use, lower your compute costs, and overall improve your application’s performance. The profiler simplifies the troubleshooting and exploration of the application’s runtime behavior using visualizations. Examples of such issues include excessive recreation of expensive objects, expensive deserialization, usage of inefficient libraries, and excessive logging.

This post provides two sample application Demo profiles by default in the profiler section to demonstrate the visualization of CPU and latency characteristics of those profiles. This offers a quick and easy way to check the profiler output without onboarding an application. Additionally, there are recommendations shown for the {CodeGuru} DemoProfilingGroup-WithIssues application profile. However, if you would like to run a proof-of-concept with real application, please follow the procedure below.

The following steps launch a sample application on Amazon EC2 and configure Profiler to monitor the application performance from the CodeGuru service.

Step 1: Create a profiling group
Follow these steps to create a profiling group:

1. From the CodeGuru dashboard, choose Profiler from the left panel.

2. Under Profiling groups, select Create profiling group and type the name of your group. This workshop uses the name DemoProfilingGroup.

3. After typing the name, choose Create in the bottom right corner.

The output page shows you instructions on how to onboard the CodeGuru Profiler Agent library into your application, along with the necessary permissions required to successfully submit data to CodeGuru. This workshop uses the AWS CloudFormation template to automate the onboarding configuration and launch Amazon EC2 with the application and its dependencies.

Step 2: Run AWS Cloudformation to launch Amazon EC2 with the Java application:
This example runs an AWS CloudFormation template that does all the undifferentiated heavy lifting of launching an Amazon EC2 machine and installing JDK, Maven, and the sample demo application.

Once done, it configures the application to use a profiling group named DemoProfilingGroup, compiles the application, and executes it as a background process. This results in the sample demo application running in the region you choose, and submits profiling data to the CodeGuru Profiler Service under the DemoProfilingGroup profiling group created in the previous step.

To launch the AWS CloudFormation template that deploys the demo application, choose the following Launch Stack button, and fill in the Stack name, Key-pair name, and Profiling Group name.

Launch Button

Once the AWS CloudFormation deployment succeeds, log in to your terminal of choice and use ssh to connect to the Amazon EC2 machine. Check the running application using the following commands:

ssh -i '<path-to-keypair.pem-file>' ec2-user@<ec2-ip-address>
java -version
mvn -v
ps -ef | grep SalesSystem  => This is the java application running in the background
tail /var/log/cloud-init-output.log  => You should see the following output in 10-15 minutes as INFO: Successfully reported profile

Once the CodeGuru agent is imported into the application, a separate profiler thread spawns when the application runs. It samples the application CPU and Latency characteristics and delivers them to the backend Profiler service for building the application profile.

Step 3: Check the Profiler flame-graphs:
Wait for 10-15 minutes for your profiling-group to become active (if not already) and for profiling data to be submitted and aggregated by the CodeGuru Profiler service.

Visit the Profiling Groups page and choose DemoProfilingGroup. You should see the following page showing your application’s profiling data in a visualization called a flame-graph, as shown in the screenshot below. Detailed explanation about flame-graphs and how to read them follow.

Profiler flame-graph visualization

Profiler extensively uses flame-graph visualizations to display your application’s profiling data since they’re a great way to convey issues once you understand how to read them.

The x-axis shows the stack profile population (collection of stack traces) sorted alphabetically (not chronologically), and the y-axis shows stack depth, counting from zero at the bottom. Each rectangle represents a stack frame. The wider a frame is is, the more often it was present in the stacks. The top edge shows what is on CPU, and beneath it is its ancestry. The colors are usually not significant (they’re picked randomly to differentiate frames).

As shown in the preceding screenshot, the stack traces for the three threads are shown, which are triggered by the code in the SalesSystem.java file.

1) createOrderThread.run

2) createIllegalOrderThread.run

3) listOrderThread.run

The flame-graph also depicts the stack depth and points out specific function names when you hover over that block. The marked areas in the flame-graph highlight the top block functions on-CPU and spikes in stack-trace. This may indicate an opportunity to optimize.

It is evident from the preceding diagram that significant CPU time is being used by an exception stack trace (leftmost). It’s also highlighted in the recommendation report as described in Step 4 below.

The exception is caused by trying to instantiate an Enum class giving it invalid String values. If you review the file CreateIllegalOrderThread.java, you should notice the constructors being called with illegal product names, which are defined in ProductName.java.

Step 4: Profiler Recommendations:
Apart from the real-time visualization of application performance described in the preceding section, a recommendation report (generated after a period of time) may appear, pointing out suspected inefficiencies to fix to improve the application performance. Once the recommendation appears, select the Recommendation link to see the details.

Each section in the Recommendations report can be expanded in order to get instructions on how to resolve the issue, or to examine several locations in which there were issues in your data, as shown in the following screenshot.

CodeGuru profiler recommendations

In the preceding example, the report includes an issue named Checking for Values not in enum, in which it conveys that more time (15.4%) was spent processing exceptions than expected (less than 1%). The reason for the exceptions is described in Step 3 and the resolution recommendations are provided in the report.

 

CodeGuru supportability:

CodeGuru currently supports native Java-based applications for the Reviewer and Profiler functionality. The Reviewer functionality currently supports AWS CodeCommit and all cloud-hosted non-enterprise versions of GitHub products, including Free/Pro/Team, as code repositories.

Amazon CodeGuru Profiler does not have any code repository dependence and works with Java applications hosted on Amazon EC2, containerized applications running on Amazon ECS and Amazon EKS, serverless applications running on AWS Fargate, and on-premises hosts with adequate AWS credentials.

 

Cleanup

At the end of this workshop, once the testing is completed, follow these steps to disable the service to avoid incurring any further charges.

1. Reviewer: Remove the association of the CodeGuru service to the repository, so that any further Pull-request notifications don’t trigger the CodeGuru service to perform an automated code-review.

2. Profiler: Remove the profiling group.

3. Amazon EC2 Compute: Go to the Amazon EC2 service, select the CodeGuru EC2 machine, and select the option to terminate the Amazon EC2 compute.

 

Conclusion

This post reviewed the CodeGuru service and implemented code examples for the Reviewer and Profiler functionalities. It described Reviewer functionality providing automated code-reviews and detailed guidance on fixes. The Profiler functionality enabled you to visualize your real-time application stack for granular inspection and generate recommendations that provided guidance on performance improvements.

I hope this post was informative and enabled you to onboard and test the service, as well as to leverage this service in your application development workflow.

 

About the Author

Author Photo

 

Nikunj Vaidya is a Sr. Solutions Architect with Amazon Web Services, focusing in the area of DevOps services. He builds technical content for the field enablement and offers technical guidance to the customers on AWS DevOps solutions and services that would streamline the application development process, accelerate application delivery, and enable maintaining a high bar of software quality.