AWS Developer Tools Blog
Arctic: Automated Desktop Application Testing
The Amazon Corretto team delivers more than 75 OpenJDK bundles for various platforms and Java versions. These builds include the AWT and Swing UI libraries. A button still needs to look like a button after a Java update, and even the smallest change can significantly affect how graphical elements are rendered, so we validate each release to make sure we do not introduce regressions.
Validating interactive desktop applications as part of an automated build pipeline is difficult. Although solutions exist, they are limited in scope and usually require tests to be specifically written to support the automation. Adapting existing tests can be a time consuming process, and in some scenarios may not be possible. Falling back to manual verification is slow and does not scale.
Arctic addresses this problem. It supports existing tests that were intended to be run manually and is agnostic about how they were written: because it does not require any special application-side support, it can be used to validate any type of UI test. Arctic relies on the operating system to capture all required events at recording time and to reproduce them at replay time. It can therefore work with older tests that were never written with automation in mind, without modifying them.
Arctic runs on Linux, Windows, and macOS on x86_64 systems and on Linux and macOS on aarch64 systems.
How does Arctic work?
First, a test is run manually with Arctic in “recording mode” to capture a baseline set of screenshots. During recording, all keyboard and mouse events, as well as screenshots, are saved whenever instructed. Later, the recording is used to replay the original sequence of mouse and keyboard events while recording a new set of screenshots. Arctic then compares the two sets of screenshots to validate that they are the same.
A distinguishing Arctic feature is support for scenarios where a perfect pixel match is not possible. For example, a special window called “the workbench” is drawn in the background by Arctic at recording time. This window determines which area of the screen Arctic will pay attention to. Other windows called “shades” can be used to hide parts of the test that are not relevant or may appear randomly during execution. In this way, Arctic can focus on the specific area of the screen where we want to match previous and current screenshots. In addition, Arctic supports multiple image comparison mechanisms. A single isolated pixel or a slight color deviation should not be considered a failure, as a human operator could probably not detect such differences. Arctic can tell these immaterial fluctuations apart from differences that matter.
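To illustrate the idea of tolerance-based comparison (this is only a minimal sketch, not Arctic's actual comparator; the class name and thresholds are invented for the example), a comparator that ignores small per-channel color deviations and a handful of mismatching pixels could look like this:

import java.awt.image.BufferedImage;

public class FuzzyImageComparator {

    // Returns true when two same-sized images match within a per-channel
    // color tolerance, allowing at most maxBadPixels pixels to exceed it.
    public static boolean matches(BufferedImage a, BufferedImage b,
                                  int channelTolerance, int maxBadPixels) {
        if (a.getWidth() != b.getWidth() || a.getHeight() != b.getHeight()) {
            return false;
        }
        int badPixels = 0;
        for (int y = 0; y < a.getHeight(); y++) {
            for (int x = 0; x < a.getWidth(); x++) {
                if (channelDiff(a.getRGB(x, y), b.getRGB(x, y)) > channelTolerance) {
                    if (++badPixels > maxBadPixels) {
                        return false;
                    }
                }
            }
        }
        return true;
    }

    // Largest absolute difference across the red, green, and blue channels.
    private static int channelDiff(int rgb1, int rgb2) {
        int dr = Math.abs(((rgb1 >> 16) & 0xFF) - ((rgb2 >> 16) & 0xFF));
        int dg = Math.abs(((rgb1 >> 8) & 0xFF) - ((rgb2 >> 8) & 0xFF));
        int db = Math.abs((rgb1 & 0xFF) - (rgb2 & 0xFF));
        return Math.max(dr, Math.max(dg, db));
    }
}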
Even with these features, Arctic may report a test failure that a human would consider acceptable. Arctic includes a way to review failed image comparisons, and those that are considered valid can be added as approved alternatives to be used in the next iteration of tests. Future runs will then compare current screenshots against available alternatives.
Other Arctic features are:
- Configurable image comparators for screenshots
- Session persistence and restoration to review results later on a different platform
- Automatic removal of redundant events for test playback
- Test playback speed control to run tests faster than originally recorded
- Configurable logging
- Overlays to help humans locate differences between two images
How do I start using Arctic?
Arctic configuration:
Arctic is a Java application, so it requires a JDK. Arctic needs at least JDK 11 to run, but we recommend using JDK 21.
Arctic can be configured using the recorder.properties and player.properties files, which are read from the current working directory. If one or both are not present, sample versions will be written. Documentation for each property key is included in the corresponding file.
Test environment configuration:
As Arctic relies on pixel comparison to validate results, changes to the test environment may cause Arctic to report false positives. Common settings that can cause problems include:
- Desktop background
- Screen resolution
- UI theme
- Installed fonts
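One lightweight safeguard is to record a couple of these values next to each baseline and check them again before replaying. The following small Java sketch is not part of Arctic; it simply prints the screen resolution and the number of installed font families so they can be compared across runs:

import java.awt.Dimension;
import java.awt.GraphicsEnvironment;
import java.awt.Toolkit;

public class EnvCheck {
    public static void main(String[] args) {
        // Resolution of the default display
        Dimension size = Toolkit.getDefaultToolkit().getScreenSize();
        System.out.println("Resolution: " + size.width + "x" + size.height);

        // Installed font families; a changed count hints at environment drift
        String[] fonts = GraphicsEnvironment.getLocalGraphicsEnvironment()
                .getAvailableFontFamilyNames();
        System.out.println("Font families: " + fonts.length);
    }
}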
NOTE: The demo app you will see in the following sections was prepared on macOS.
Download Arctic
Arctic binaries are available, but for this demo we are going to download the source and build it ourselves.
- Arctic binaries
- Arctic source code
Arctic control keys
Because Arctic captures all keyboard and mouse events, we need a way to tell Arctic what we want it to do during the recording without interfering with the test. This is done by pressing specific key combinations. When several modifier keys (such as ctrl or alt) are pressed together, Arctic interprets the next key presses as an instruction. Among other uses, these instructions can start or stop a recording or capture a screenshot.
The keys are configured in the recorder.properties file with properties that start with arctic.recorder.control.jnh.
Which keycodes correspond to different keys can be checked by running:
java -jar arctic-<VERSION>.jar -k
This command will print the keycode of the different keys as they are pressed. By default, ctrl+alt+z (ctrl+option+z) starts/stops a recording and ctrl+alt+x (ctrl+option+x) captures a screenshot.
Recording a test
Before running a test, start Arctic in recording mode by executing:
java -jar arctic-<VERSION>.jar -r
Once you see the Workbench and Shade windows, you can start running a test case. Before starting the recording, make sure the Workbench window covers the entirety of the screen that will be relevant during test execution. The shade window can be used to hide parts of the screen.
Additionally, we need to tell Arctic which test is on the screen. Arctic identifies tests using testName and testCase string values, which can be supplied by running:
java -jar arctic-<VERSION>.jar -c test start <testName> <testCase>
There are other ways to notify Arctic of which test is on the screen. For example, Arctic can be integrated with your test framework: it includes support for jtHarness, so it can get the current test’s name from there.
Next, we start the recording and manually interact with the application under test.
Every time a user interaction causes a change to the UI that we consider relevant, we take a screenshot.
Once we are done with the test and close the window, we can stop recording. Capturing the act of closing the test as part of the recording is optional, but will leave the environment ready for the next recording.
Screenshots and recordings are saved in the tests folder. Its location can be configured in both recorder.properties and player.properties, but the two values need to match:
# Defines where the test suite we record will be stored.
arctic.common.testPath = ../tests
Replay a test
Once a recording is finished, Arctic is ready to replay the test. Launch Arctic in player mode with:
java -jar arctic-<VERSION>.jar -p
Once Arctic has loaded, start the relevant test and then instruct Arctic to run it using java -jar arctic-<VERSION>.jar -c test start <testName> <testCase>.
Arctic will reproduce the keyboard and mouse events and capture screenshots at the same points as in the recording. Once the test is finished, you can inform Arctic using:
java -jar arctic-<VERSION>.jar -c test finish <testName> <testCase> <result>
You can use the result field to tell Arctic whether anything went wrong on the test side, for example, based on the test’s return code. If result is true and all screenshots captured during the test are considered good enough, Arctic will mark the test as ok. If any of the screenshots differ or result is false, Arctic will mark the test as failed.
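If you prefer to script this hand-off rather than type the commands by hand, a small driver can issue the same client commands around your test run. The sketch below is only an illustration of that idea: the jar names and test identifiers are hypothetical placeholders, and the commands it sends are the test start and test finish commands described above.

import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

public class ArcticTestDriver {

    // Hypothetical paths; adjust to your environment.
    private static final String ARCTIC_JAR = "arctic-1.0.jar";
    private static final String TEST_JAR = "my-ui-test.jar";

    public static void main(String[] args) throws IOException, InterruptedException {
        String testName = "MyTestSuite";
        String testCase = "ButtonRendering";

        // Tell the running Arctic instance which test is about to appear on screen.
        runArcticClient("test", "start", testName, testCase);

        // Launch the UI test and wait for it; a non-zero exit code is
        // reported to Arctic as a failed test-side result.
        Process test = new ProcessBuilder("java", "-jar", TEST_JAR).inheritIO().start();
        boolean passed = test.waitFor() == 0;

        // Report the result; Arctic combines it with its screenshot comparison.
        runArcticClient("test", "finish", testName, testCase, Boolean.toString(passed));
    }

    private static void runArcticClient(String... clientArgs)
            throws IOException, InterruptedException {
        List<String> command = new ArrayList<>(List.of("java", "-jar", ARCTIC_JAR, "-c"));
        command.addAll(List.of(clientArgs));
        new ProcessBuilder(command).inheritIO().start().waitFor();
    }
}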
Demo video
Getting Arctic results
Once you have run all your tests, you can export the results: Arctic supports both JUnit and TAP file formats. Exclusion files for jtHarness, containing a list of all the tests that failed, can also be generated. These files are created with:
java -jar arctic-<VERSION>.jar <format> save <filename>
Valid formats are:
- xml: JUnit XML report
- tap: TAP version 13
- jtx: jtHarness exclusion file
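For example, to export the results as a JUnit-style XML report (the file name here is just an example):
java -jar arctic-<VERSION>.jar xml save results.xml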
Review screenshot comparisons
For multiple reasons, tests may not pass the first time after recording. The results can be reviewed using the Arctic sc command. For example:
java -jar arctic-<VERSION>.jar -c sc all
This will start a review of all the screenshots that failed in the current session. The review can help you identify why the comparison failed, and it also lets you add the current screenshot as an approved alternative for future runs.
What’s next? Where can I learn more and how can I get involved?
The Amazon Corretto team appreciates any feedback and questions about Arctic, and looks forward to hearing from you to help us refine our roadmap. Please visit the Arctic GitHub repository to learn more about Arctic, report problems, and contribute.
Acknowledgements
We would like to acknowledge the following open source projects that made Arctic possible:
- jNativeHook
- Apache Commons Configuration
- Google Gson
- Google Guice
- JUnit
- Mockito
- SLF4J
- Lombok
- Checkstyle
- Gradle
About the author: