Amazon CodeGuru Reviewer is trained using rule mining and supervised machine learning models that use a combination of logistic regression and neural networks.
For example, during training for deviation from AWS best practices, Amazon CodeGuru Reviewer mines Amazon code bases using search techniques and locality sensitive models for pull requests that include AWS API calls. It looks at code changes intended to improve the quality of the code, and cross-references them against documentation data. The result is the creation of a new set of rules that Reviewer recommends to you as best practices when it reviews your code.
During training for resource and sensitive data leaks, it does a full code analysis for all code paths that use the resource or sensitive data, creates a feature set representing those, and then uses those as inputs for logistic regression models and convolutional neural networks (CNNs).
For code inconsistencies, the models are trained during either the full or incremental code review. After a customer triggers a review, these models utilize a number of data mining and machine learning techniques to build the dataset, highlight the reason for the code patterns, and make recommendations customized to the customer’s code.
For both rule-based and machine learning-based models, Amazon CodeGuru Reviewer uses the feedback you provide as labels and iteratively improves the quality of code detectors.