
[Dev UI] Allow comparison over different versions of a dataset #2606

Open
shrutip90 opened this issue Apr 2, 2025 · 0 comments
Overview

A new version of a dataset is created every time the dataset changes; most commonly this is the addition of a new test case. Right now, the evals comparison feature only allows comparison between runs on the same version of a dataset. This is too restrictive, since even a single test-case change makes eval runs incomparable. Can we allow comparison across different versions?

To support this, we could do an outer join on the eval results from multiple runs, but we need to figure out how to display changes in the dataset's input and reference fields:
[Screenshot: eval comparison view]

Input and reference are currently taken from the baseline run's results. If the comparison runs have modified inputs or additional test cases, how do we surface them?
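As a rough sketch of the outer-join idea: join results by example id and tag each row with a status, so the UI can render added, removed, or modified test cases distinctly instead of silently taking input/reference from the baseline. All field names here (`example_id`, `input`, `reference`, `score`) are hypothetical illustrations, not the actual schema.

```python
# Sketch: outer-join eval results from two runs made against different
# versions of a dataset. Each run is a dict keyed by example id.


def outer_join_results(baseline, comparison):
    """Join two runs' results by example id, flagging per-row diffs."""
    ids = sorted(set(baseline) | set(comparison))
    joined = []
    for ex_id in ids:
        b = baseline.get(ex_id)
        c = comparison.get(ex_id)
        if b is None:
            status = "added"      # test case only in the comparison run
        elif c is None:
            status = "removed"    # test case only in the baseline run
        elif b["input"] != c["input"] or b["reference"] != c["reference"]:
            status = "modified"   # input/reference changed between versions
        else:
            status = "unchanged"
        joined.append({
            "example_id": ex_id,
            "baseline": b,
            "comparison": c,
            "status": status,
        })
    return joined


baseline = {
    "ex1": {"input": "2+2", "reference": "4", "score": 1.0},
    "ex2": {"input": "3*3", "reference": "9", "score": 0.0},
}
comparison = {
    "ex1": {"input": "2+2", "reference": "4", "score": 1.0},
    "ex2": {"input": "3*3", "reference": "nine", "score": 1.0},  # reference edited
    "ex3": {"input": "5-1", "reference": "4", "score": 1.0},     # new test case
}

rows = outer_join_results(baseline, comparison)
```

The `status` column is one possible answer to the display question: "unchanged" rows compare normally, while "added"/"removed"/"modified" rows get a visual marker and show both versions of the input/reference side by side.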

Designs

TBD
