Golden Test in Flutter

Flutter stores reference images of your UI, then compares later renders against them pixel by pixel.

Shirsh Shukla
6 min read · Jan 30, 2024

As a Flutter developer, you probably enjoy using widgets to build your app’s visual puzzle. Testing the logic of your code is fairly straightforward, but testing how your app looks is much harder. Checking whether a function returns the right value is easy; determining whether an entire screen of buttons and text looks right is tough.

A basic test pattern involves calling the code that you want to test, then running more code to verify that the original code worked as expected. The following is an example of how that might look in Dart.
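A minimal sketch of that pattern, assuming the `flutter_test` package that ships with every Flutter project. The `add` function is a hypothetical stand-in for whatever code you want to test.

```dart
import 'package:flutter_test/flutter_test.dart';

// Hypothetical function under test, used only for illustration.
int add(int a, int b) => a + b;

void main() {
  test('add returns the sum of its arguments', () {
    // Step 1: call the code we want to test.
    final int result = add(2, 3);

    // Step 2: verify that it worked as expected.
    expect(result, 5);
  });
}
```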

So, that works great for a lot of operations. But how can we programmatically test the visual appearance of a UI element? This brings us to the first tool many developers reach for, golden tests.


Now, here’s where golden tests come in. They’re like special tests for how your app looks. Instead of just checking whether the code works, golden tests take a picture of your app’s screen and compare it to a reference picture. These reference pictures, called golden files, are snapshots of what your app should look like. You generate them once, after double-checking that everything looks good in your app. So, golden tests make sure not only that your app functions correctly but also that it looks the way it’s supposed to. It’s like having a visual safety net for your app’s user interface.

Golden tests are a pattern where Flutter stores image files for parts of a UI, then later re-renders those UI elements and compares them against the stored images pixel by pixel.

Let’s look into this with a simple example: the counter app we are all familiar with.

Simple Counter App (main.dart):
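A minimal sketch of such a counter app, modeled on the default `flutter create` template. The title strings and theme are assumptions for illustration and may differ from the author's project; the `MyApp` root widget is the one the golden test pumps later.

```dart
import 'package:flutter/material.dart';

void main() => runApp(const MyApp());

// Root widget of the application.
class MyApp extends StatelessWidget {
  const MyApp({super.key});

  @override
  Widget build(BuildContext context) {
    return MaterialApp(
      title: 'Counter App', // assumed title
      theme: ThemeData(primarySwatch: Colors.blue),
      home: const MyHomePage(title: 'Counter App'),
    );
  }
}

class MyHomePage extends StatefulWidget {
  const MyHomePage({super.key, required this.title});

  final String title;

  @override
  State<MyHomePage> createState() => _MyHomePageState();
}

class _MyHomePageState extends State<MyHomePage> {
  int _counter = 0;

  // Rebuilds the widget with the incremented value.
  void _incrementCounter() {
    setState(() => _counter++);
  }

  @override
  Widget build(BuildContext context) {
    return Scaffold(
      appBar: AppBar(title: Text(widget.title)),
      body: Center(
        child: Column(
          mainAxisAlignment: MainAxisAlignment.center,
          children: <Widget>[
            const Text('You have pushed the button this many times:'),
            Text(
              '$_counter',
              style: Theme.of(context).textTheme.headlineMedium,
            ),
          ],
        ),
      ),
      floatingActionButton: FloatingActionButton(
        onPressed: _incrementCounter,
        tooltip: 'Increment',
        child: const Icon(Icons.add),
      ),
    );
  }
}
```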

In the test folder of our Flutter project, we create golden_widget_test.dart:

With the following code, we will run the golden test:
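A sketch of the test file, reconstructed from the walkthrough in this article (`testWidgets`, `pumpWidget`, `expectLater`, `find.byType(MyApp)`, `matchesGoldenFile('counter.png')`). The package name `counter_app` in the import is an assumption; replace it with the name from your own pubspec.yaml.

```dart
import 'package:flutter_test/flutter_test.dart';

// Assumed package name; use the one declared in your pubspec.yaml.
import 'package:counter_app/main.dart';

void main() {
  testWidgets('Golden test', (WidgetTester tester) async {
    // Render the root widget of the app.
    await tester.pumpWidget(const MyApp());

    // Compare the rendered widget with the stored golden image.
    await expectLater(
      find.byType(MyApp),
      matchesGoldenFile('counter.png'),
    );
  });
}
```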

In the Flutter project test folder, a file named golden_widget_test.dart has been created to implement a golden test. This type of test involves comparing the rendered appearance of a widget with a pre-existing “golden” image to ensure that any changes made to the UI do not unintentionally alter the visual output. The golden test is defined within a main function, which serves as the entry point for the test execution.

The test is implemented using the `testWidgets` function, which is a standard testing utility in Flutter for widget testing. The test case itself is named ‘Golden test’ and is asynchronous, indicating that it involves asynchronous operations like pumping the widget and performing image comparisons.

The test begins by pumping the widget using the `tester.pumpWidget(MyApp())` statement. This step renders the specified widget, in this case an instance of `MyApp`, which is the root widget of the application.

Once the widget is successfully rendered, the golden test proceeds to the comparison phase. The `expectLater` function is used to assert that the rendered widget, identified by `find.byType(MyApp)`, matches a golden image file named ‘counter.png’.

The `matchesGoldenFile` matcher is employed to perform this image-based comparison. This essentially means that the test will pass if the rendered appearance of the widget matches the content of the ‘counter.png’ image file.

So, to create Golden files in Flutter:

1. To generate images for all golden tests, or update existing ones, run:

flutter test --update-goldens

Look for the `counter.png` file in your project’s test folder and add it to version control.

2. To generate or update the images for a specific test file, run:

flutter test --update-goldens <path_to_test_file>

3. To verify the UI against the golden files, run the tests as usual:

flutter test

Here’s a complete sample of the ‘Counter app’ code. You can check what differs from the project that `flutter create` generates. Feel free to clone the project and experiment. For instance, if you change the color or the FloatingActionButton and run `flutter test`, the test should fail.

Here is what a failure looks like. In this case, four files are generated in a `failures` folder next to the test:

  • counter_isolatedDiff.png
  • counter_maskedDiff.png
  • counter_masterImage.png
  • counter_testImage.png

counter_isolatedDiff.png: generated when the test fails, this file shows the difference between the expected image and the actual image rendered by the widget, isolated from everything that did not change. It is useful for pinpointing exactly which pixels differ.

counter_maskedDiff.png: also generated when the test fails, this file shows the same difference in context. The difference is highlighted in red, and the rest of the image is masked out.

counter_masterImage.png: a copy of the expected image, that is, the golden file the test compares against. It serves as the reference for deciding whether the widget renders correctly.

counter_testImage.png: the actual image rendered by the widget during the failing test run. It is compared with counter_masterImage.png to determine whether the widget renders correctly.

With these four files, you can easily see where the UI went wrong.

So, golden tests are useful for ensuring your widgets have the expected appearance. They don’t have to cover the entire screen; you can create them for specific UI elements. If you intentionally modify your UI later on, remember to regenerate the images for your tests.

In conclusion, Flutter utilizes golden tests to detect regressions and rendering issues. If you’re new to golden tests, consider their benefits, but if you’re already using them, be mindful of certain factors.

Keep in mind that every golden test adds image files that have to be stored somewhere. Storing them in a Git repository is an option, but it’s not ideal for projects with numerous tests. False positives can also arise when minor pixel differences trigger failures.
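One way to soften false positives from tiny pixel differences is a comparator that tolerates a small percentage of differing pixels. The sketch below is an assumption-laden illustration, not part of the sample project: the class name `TolerantGoldenComparator` and the 1% default threshold are hypothetical, and it builds on `LocalFileComparator`, `GoldenFileComparator.compareLists`, and `generateFailureOutput` from `flutter_test`.

```dart
import 'dart:typed_data';

import 'package:flutter/foundation.dart';
import 'package:flutter_test/flutter_test.dart';

// Hypothetical comparator: passes when the share of differing pixels
// is at or below `threshold` (0.01 means 1%).
class TolerantGoldenComparator extends LocalFileComparator {
  TolerantGoldenComparator(super.testFile, {this.threshold = 0.01})
      : assert(threshold >= 0 && threshold <= 1);

  final double threshold;

  @override
  Future<bool> compare(Uint8List imageBytes, Uri golden) async {
    // Pixel-by-pixel comparison of the rendered image and the golden file.
    final ComparisonResult result = await GoldenFileComparator.compareLists(
      imageBytes,
      await getGoldenBytes(golden),
    );

    // Accept exact matches and differences within the tolerance.
    if (result.passed || result.diffPercent <= threshold) {
      return true;
    }

    // Otherwise write the usual failure images and fail the test.
    final String error = await generateFailureOutput(result, golden, basedir);
    throw FlutterError(error);
  }
}
```

To try it, a test would assign an instance to the global `goldenFileComparator` before its golden checks run, passing the URI of the test file. Pick the threshold carefully: too high a value hides real regressions.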

Lastly, keep in mind that golden tests are more effective for verifying components than entire screens. Testing a single button, such as the FloatingActionButton used here, with a golden test proves more reliable than testing an entire screen that contains that button.
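A component-level golden test for the button alone might look like the sketch below. The file name `fab.png` and the key `fab-boundary` are hypothetical. Because `matchesGoldenFile` captures the nearest `RepaintBoundary` ancestor of the matched widget, the button is wrapped in its own `RepaintBoundary` so the image covers just the button rather than the whole screen.

```dart
import 'package:flutter/material.dart';
import 'package:flutter_test/flutter_test.dart';

void main() {
  testWidgets('FloatingActionButton golden test', (WidgetTester tester) async {
    // Hypothetical key used to find the boundary around the button.
    const Key boundaryKey = Key('fab-boundary');

    await tester.pumpWidget(
      MaterialApp(
        home: Scaffold(
          body: Center(
            // The RepaintBoundary limits the captured image to the button.
            child: RepaintBoundary(
              key: boundaryKey,
              child: FloatingActionButton(
                onPressed: () {},
                tooltip: 'Increment',
                child: const Icon(Icons.add),
              ),
            ),
          ),
        ),
      ),
    );

    // Compare only the button against its own golden file.
    await expectLater(
      find.byKey(boundaryKey),
      matchesGoldenFile('fab.png'),
    );
  });
}
```

As with the full-screen test, run `flutter test --update-goldens` once to create `fab.png` before the comparison can pass.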

In summary, golden tests in Flutter offer valuable insights but require careful consideration of storage and potential false positives. Focus on component-level testing for optimal reliability.

Did I get something wrong? Mention it in the comments. I would love to improve. Your support means a lot to me! If you enjoy the content, I’d be grateful if you could consider subscribing to my YouTube channel as well.

I am Shirsh Shukla, a creative developer and technology lover. You can find me on LinkedIn, follow me on Twitter, or browse my portfolio for more details. And of course, you can follow me on GitHub as well.

Have a nice day!🙂

Shirsh Shukla

SDE at Reliance Jio | Mobile Application Developer | Speaker | Technical Writer | community member at Stack Overflow | Organizer @FlutterIndore