Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improve egui_kittest screenshot change detector #5773

Open
emilk opened this issue Mar 7, 2025 · 1 comment
Open

Improve egui_kittest screenshot change detector #5773

emilk opened this issue Mar 7, 2025 · 1 comment

Comments

@emilk
Copy link
Owner

emilk commented Mar 7, 2025

Different hardware produce subtly different results:

https://docs.rs/egui_kittest/latest/egui_kittest/#what-do-do-when-ci--another-computer-produces-a-different-image

So we want a screenshot test that ignores small changes but detects large ones.

We should figure out a good default algorithm for this test by building up a test set of pairs of images that should either trigger or not trigger the threshold.

For instance, we recently discovered that these two images fell under our default threshold (i.e. were considered the same) for 98% of all pixels.

Image

Image

The screenshots above came up during the development of rerun-io/rerun#9151
That piece of work transitively broke the test in question and expected a failure but didn't get one.
It is important to note that while this was using egui_kittest's default per-pixel treshholds, it already set a "pixel-ignore-treshold" (as described in #5683) https://github.com/rerun-io/rerun/blob/6001edac74fb39d022eaeac310958a78ac791b4f/crates/viewer/re_view_spatial/tests/visible_time_range.rs#L417

Related to:

@emilk emilk added this to egui Mar 7, 2025
Wumpf added a commit to rerun-io/rerun that referenced this issue Mar 10, 2025
### Related

* Fixes #8557
* Improves test introduced in
#9205
* Note that this test wasn't stable against failure previously 😱, see
emilk/egui#5773 for thoughts about improving
these things in the future

### What

Overrides from blueprint do what they're supposed to again (unclear when
this broke - did it ever work correctly?)

![image](https://github.com/user-attachments/assets/d576f778-60e0-424a-8da5-359a0ede6b1d)


When trying to ensure automated testing for this things started to
ripple out a little bit - commit by commit review recommended.
jprochazk pushed a commit to rerun-io/rerun that referenced this issue Mar 11, 2025
### Related

* Fixes #8557
* Improves test introduced in
#9205
* Note that this test wasn't stable against failure previously 😱, see
emilk/egui#5773 for thoughts about improving
these things in the future

### What

Overrides from blueprint do what they're supposed to again (unclear when
this broke - did it ever work correctly?)

![image](https://github.com/user-attachments/assets/d576f778-60e0-424a-8da5-359a0ede6b1d)


When trying to ensure automated testing for this things started to
ripple out a little bit - commit by commit review recommended.
@lucasmerlin
Copy link
Collaborator

lucasmerlin commented Mar 13, 2025

Maybe we can introduce a per-OS treshold?
Then we define one source OS where the treshold is really low. Ideally the snapshots for this OS would always be software rendered via some docker image (so the source OS would be linux), This way everyone, independent of their actual OS, can update the snapshots.

When testing for the other OSes the treshold could then be overriden in CI via some environment variable?
And in special cases where even more threshold is needed, the test could just set the treshold via some cfg(target = ) thing

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
Status: No status
Development

No branches or pull requests

2 participants