Skip to content
This repository was archived by the owner on Feb 25, 2025. It is now read-only.

[Impeller] Adds the ability to specify a golden threshold #40824

Merged
merged 6 commits into from
Apr 3, 2023

Conversation

gaaclarke
Copy link
Member

reverts #40818

Pre-launch Checklist

  • I read the Contributor Guide and followed the process outlined there for submitting PRs.
  • I read the Tree Hygiene wiki page, which explains my responsibilities.
  • I read and followed the Flutter Style Guide and the C++, Objective-C, Java style guides.
  • I listed at least one issue that this PR fixes in the description above.
  • I added new tests to check the change I am making or feature I am adding, or Hixie said the PR is test-exempt. See testing the engine for instructions on writing and running engine tests.
  • I updated/added relevant documentation (doc comments with ///).
  • I signed the CLA.
  • All existing and new tests are passing.

If you need help, consider asking for advice on the #hackers-new channel on Discord.

@gaaclarke gaaclarke requested a review from dnfield March 31, 2023 18:55
@gaaclarke
Copy link
Member Author

fyi @Piinks @chinmaygarde

Copy link
Member

@chinmaygarde chinmaygarde left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We don't want thresholds, we want to follow the Skia GPU teams guidelines on having the Gold results be tackled out of band.

@gaaclarke
Copy link
Member Author

We don't want thresholds, we want to follow the Skia GPU teams guidelines on having the Gold results be tackled out of band.

I thought those things are mutually exclusive? No matter how failures are triaged we don't want this test to be tripping up all the time, right?

@flutter-dashboard
Copy link

Golden file changes have been found for this pull request. Click here to view and triage (e.g. because this is an intentional change).

If you are still iterating on this change and are not ready to resolve the images on the Flutter Gold dashboard, consider marking this PR as a draft pull request above. You will still be able to view image results on the dashboard, commenting will be silenced, and the check will not try to resolve itself until marked ready for review.

Changes reported for pull request #40824 at sha bcc52d2

@gaaclarke
Copy link
Member Author

I'm actually a bit confused by SkiaGold's dashboard, the max diff pixels is 7864, yet the different number of pixels is reported as 24. Shouldn't this not have been flagged?

@Piinks
Copy link
Contributor

Piinks commented Mar 31, 2023

cc @mdebbar

the max diff pixels is 7864, yet the different number of pixels is reported as 24. Shouldn't this not have been flagged?

I am not familiar with how the engine is configured for Gold.

@chinmaygarde
Copy link
Member

No matter how failures are triaged we don't want this test to be tripping up all the time, right?

The understanding is that each gold test has multiple valid variants. As long as a triager confirms that two variants are sufficiently alike, the test will pass unless an entirely new variant is detected.

@gaaclarke
Copy link
Member Author

gaaclarke commented Mar 31, 2023

The understanding is that each gold test has multiple valid variants. As long as a triager confirms that two variants are sufficiently alike, the test will pass unless an entirely new variant is detected.

Ahh yea, I've heard mention of such a feature. Who can help us with that?

@mdebbar
Copy link
Contributor

mdebbar commented Apr 3, 2023

The way the fuzzy matcher works in Gold is by doing two checks:

  1. How many pixels are different between baseline image and test image? This is controlled by the differentPixelsRate parameter that you are using in this PR. So you are good here.

  2. How much is each of these pixels allowed to differ from its corresponding pixel in the baseline image? This is controlled by the pixelColorDelta parameter. The default is 0 and you are overriding this in your PR.

What's happening is your image is passing the 1st condition but it's failing on the 2nd.

@gaaclarke
Copy link
Member Author

@chinmaygarde there was a lot of talk about this in different channels. I'll try to summarize it:

  1. Skia Gold already can work against multiple goldens
  2. We want to make failures not block the roll (outlined in The SkiaGold check should not block rolls into flutter/engine flutter#123979)
  3. The number of accepted golden variations is limited to around the number of devices running

I think we still want a tiny bit of fuzzing in addition to the other mechanisms. This PR makes it so all the tests pass if less than 1% of pixels are different by less than 4 color component deltas. (I increased the color delta for the rotated text to 40). PTAL, let me know if you want to discuss this further and we can try to suss everything out.

@gaaclarke gaaclarke requested a review from chinmaygarde April 3, 2023 18:24
Copy link
Member

@chinmaygarde chinmaygarde left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we still want a tiny bit of fuzzing in addition to the other mechanisms...

I think that is totally reasonable. But I'd rather not have the tests get into the business of determining the deltas like in the current version of the patch.

If there is a global fuzz value you are comfortable with, let's set it in the test harness. If not, we can leave it out altogether. I'd rather not have to think about how to figure out maxDiffPixelsPercent and max_color_delta for each screenshot. That is cognitive overhead we don't want to add every time we write a test. Especially if the process is going to be semi-manual anyway.

@gaaclarke
Copy link
Member Author

If there is a global fuzz value you are comfortable with, let's set it in the test harness.

Okay, I switched it up so the whole test_runner has the same values across the whole tests.

@gaaclarke gaaclarke requested a review from chinmaygarde April 3, 2023 21:03
@flutter-dashboard
Copy link

Golden file changes are available for triage from new commit, Click here to view.

Changes reported for pull request #40824 at sha d0cf61d

@gaaclarke
Copy link
Member Author

I'm a bit confused about what Skia gold is doing. The color delta is clearly set to 8 but the skia gold report has it set to zero:

Screenshot 2023-04-03 at 3 47 54 PM

Let's land this and keep an eye on it. It may be some latent state or inability to change the fuzzy threshold after the fact, we'll see.

@gaaclarke gaaclarke added the autosubmit Merge PR when tree becomes green via auto submit App label Apr 3, 2023
@auto-submit auto-submit bot merged commit 6a6d8cc into flutter:main Apr 3, 2023
@Piinks
Copy link
Contributor

Piinks commented Apr 5, 2023

This does not appear to be working as intended. The correct config for fuzzy matching is being received by gold, but it looks like this image is just flaky. @camsim99 came across it in #40924 and looking at the digest, we can see the image has a different color dot for every commit. This means it is not producing a consistent image, and is blocking PRs from landing.

https://flutter-engine-gold.skia.org/search?issue=40924&crs=github&patchsets=3&corpus=flutter-engine

Screenshot 2023-04-05 at 4 48 25 PM

image

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
autosubmit Merge PR when tree becomes green via auto submit App e: impeller will affect goldens
Projects
No open projects
Archived in project
Development

Successfully merging this pull request may close these issues.

4 participants