Overview
Datasets get a new version every time the dataset changes in any way; most commonly this is adding a new test case. Right now, the evals comparison feature only allows comparing runs made against the same dataset version, but this is too restrictive: a single test-case change makes eval runs un-comparable. Can we allow comparison across different versions?
To support this, we could do an outer join on the eval results from multiple runs, but we need to figure out how to display changes in the dataset's input and reference fields:

Input and reference are currently taken from the baseline results. If the comparison runs have modified inputs or additional test cases, how do we surface them?
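A minimal sketch of the outer-join idea, using pandas. The column names (`example_id`, `input`, `reference`, `score`) and the two-run shape are assumptions for illustration, not the actual schema; the point is that the merge indicator flags test cases present in only one run, and a simple comparison flags inputs edited between versions:

```python
import pandas as pd

# Hypothetical eval results from two runs over different dataset versions.
# Column names are assumptions, not the real schema.
baseline = pd.DataFrame({
    "example_id": [1, 2, 3],
    "input": ["a", "b", "c"],
    "reference": ["A", "B", "C"],
    "score": [0.9, 0.7, 0.8],
})
comparison = pd.DataFrame({
    "example_id": [1, 2, 4],    # id 3 was removed, id 4 was added
    "input": ["a", "b*", "d"],  # input for id 2 was edited
    "reference": ["A", "B", "D"],
    "score": [0.95, 0.6, 0.5],
})

# Outer join on the test-case id; the _merge indicator column marks rows
# that exist in only one run, so the UI could label added/removed cases.
joined = baseline.merge(
    comparison,
    on="example_id",
    how="outer",
    suffixes=("_baseline", "_comparison"),
    indicator=True,
)

# Flag test cases whose input changed between dataset versions.
joined["input_changed"] = (
    (joined["_merge"] == "both")
    & (joined["input_baseline"] != joined["input_comparison"])
)
```

With this shape, rows tagged `left_only` are test cases dropped since the baseline version, `right_only` rows are newly added ones, and `input_changed` marks edited inputs, which is roughly the information the comparison view would need to surface.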
Designs
TBD