Releases: NVIDIA-NeMo/Evaluator
NVIDIA Evaluator 0.3.0
NVIDIA NeMo Evaluator Launcher 0.2.6
Changelog Details
- fix(export): add large Gym artifacts to excluded files by @marta-sd :: PR: #907
- fix(nel): fall back to main when version tag not found in skills add by @piojanu :: PR: #906
- fix(launcher): re-raise unrelated ModuleNotFoundError in container_metadata init by @wprazuch :: PR: #909
- feat: add export_mounts and export_image to auto-export config by @agronskiy :: PR: #911
- fix: move auto-export to separate CPU-only sbatch job by @AdamRajfer :: PR: #901
- fix: make SWEbench live progress script restart-safe by @piojanu :: PR: #872
- fix: force base-10 in bash _walltime_to_seconds by @agronskiy :: PR: #916
- docs: remove internal GitLab URL from launching-evals skill by @piojanu :: PR: #920
- feat: improve launching-evals and nel-assistant skills by @piojanu :: PR: #910
- fix(exporters): socket name too long for long hostnames by @prokotg :: PR: #921
- feat(launcher): expose GPUs to eval container for compute-eval by @wprazuch :: PR: #912
- chore(launcher): add exclude_patterns to mlflow exporter (EVAL-632) by @agronskiy :: PR: #940
- feat(mlflow/exporter): default to proxied multipart upload by @agronskiy :: PR: #952
- fix: use docker cp to make sure artifacts are populated in DinD scenarios by @marta-sd :: PR: #1019
- fix: disable xtrace before setting up env vars holding secrets by @marta-sd :: PR: #1021
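The base-10 fix above (PR #916) addresses a common bash pitfall: arithmetic expansion treats numbers with a leading zero as octal, so walltime fields such as `08` or `09` raise a "value too great for base" error. The following is a minimal sketch of the technique, not the launcher's actual `_walltime_to_seconds` code. The function name `walltime_to_seconds_sketch` is hypothetical, and it assumes a plain `HH:MM:SS` input with no day component.

```bash
#!/usr/bin/env bash
# Hypothetical helper (not the launcher's implementation): convert an
# HH:MM:SS walltime string to seconds.
walltime_to_seconds_sketch() {
  local h m s
  # Split the input on ':' into hours, minutes, and seconds.
  IFS=: read -r h m s <<< "$1"
  # The 10# prefix forces base-10, so "08" and "09" are not parsed as
  # (invalid) octal literals.
  echo $(( 10#$h * 3600 + 10#$m * 60 + 10#$s ))
}

walltime_to_seconds_sketch "08:09:00"
```

Without the `10#` prefix, the same expression fails in bash for any field equal to `08` or `09`, while fields such as `07` or `10` work, which makes the bug easy to miss in testing.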
0.3.0 dev builds (latest: 0.3.0.dev26)
Rolling dev builds from the dev/0.3.0 branch.
Latest: 0.3.0.dev26
Install
pip install https://github.com/NVIDIA-NeMo/Evaluator/releases/download/v0.3.0-dev/nemo_evaluator-0.3.0.dev26-py3-none-any.whl
Versioning
| Branch | Version format | Published |
|---|---|---|
| dev/0.3.0 | 0.3.0.devN (N = commit count) | here (GitHub pre-release) |
| main | 0.3.0 | PyPI |
Auto-updated on every push to dev/0.3.0 or manual workflow dispatch. Not for production use.
NVIDIA NeMo Evaluator 0.2.8
Changelog Details
- feat(byob): add completions_logprob endpoint and extend scorers/datasets by @kanishks-23 :: PR: #953
- feat(byob): add explicit few-shot dataset support by @kanishks-23 :: PR: #993
NVIDIA NeMo Evaluator 0.2.7
NVIDIA NeMo Evaluator 0.2.6
Changelog Details
- fix: write BYOB results to per-benchmark subdirectory to avoid data overwriting by @laszkiewiczp :: PR: #856
- fix: use normalized name in BYOB FDF evaluation entry by @laszkiewiczp :: PR: #855
- feat: BYOB add output_parser parameter to judge_score() by @laszkiewiczp :: PR: #859
- fix: byob readme example by @laszkiewiczp :: PR: #854
- fix: remove obsolete run_eval from `__all__` by @marta-sd :: PR: #886
- feat(per-sample-score): per sample score by @AWarno :: PR: #888
- feat: replace Werkzeug dev server with waitress for high-concurrency adapter by @agronskiy :: PR: #896
- fix: move logger creation in ProgressTrackingInterceptor to the top by @marta-sd :: PR: #900
- fix(evaluator): distinguish interrupted and failed sigterm exits by @ngoncharenko :: PR: #882
- fix: use poll() and disable IPv6 in waitress adapter server by @agronskiy :: PR: #905
NVIDIA NeMo Evaluator Launcher 0.2.5
Changelog Details
- feat: support arbitrary sbatch flags via sbatch_extra_flags by @gchlebus :: PR: #864
- feat(extra-params): export extra params by @AWarno :: PR: #873
- docs: skill cleanups and fixes by @piojanu :: PR: #878
- docs: add auxiliary deployments example and documentation by @AdamRajfer :: PR: #875
- feat: allow duplicate task names in nel by @laszkiewiczp :: PR: #874
- fix: add missing task_idx arg to TestSbatchExtraFlags by @laszkiewiczp :: PR: #885
- feat: syntactic sugar overrides for tasks by @anowaczynski-nvidia :: PR: #759
- feat: add watch mode for continuous checkpoint evaluation by @marta-sd :: PR: #857
- feat: expose invocation ID as NEL_INVOCATION_ID env var by @agronskiy :: PR: #894
- feat: replace Werkzeug dev server with waitress for high-concurrency adapter by @agronskiy :: PR: #896
- feat: mount results for deployment by @AdamRajfer :: PR: #899
- fix: raise error when execution.env_vars is used in config by @marta-sd :: PR: #898
- fix(evaluator): distinguish interrupted and failed sigterm exits by @ngoncharenko :: PR: #882
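The `NEL_INVOCATION_ID` environment variable from PR #894 above can be read from a script to tag per-run output. The sketch below is illustrative only. The function name `invocation_log_dir`, the `logs/` prefix, and the `unknown` fallback are assumptions, and the changelog does not say in which processes the variable is visible or what format its value takes.

```bash
#!/usr/bin/env bash
# Hypothetical helper: derive a per-invocation log directory name.
# Falls back to "unknown" when NEL_INVOCATION_ID is not set, for example
# when the script runs outside the launcher.
invocation_log_dir() {
  printf 'logs/%s\n' "${NEL_INVOCATION_ID:-unknown}"
}

invocation_log_dir
```

Using the `${VAR:-default}` expansion keeps the script usable both inside and outside a launcher-managed run.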
NVIDIA NeMo Evaluator Launcher 0.2.4
NVIDIA NeMo Evaluator 0.2.5
NVIDIA NeMo Evaluator Launcher 0.2.3
Changelog Details
- docs(nemotron-3-super): reproducible configs by @prokotg :: PR: #840
- docs(SKILL.md): add ARM64 and non-standard GPU compatibility note by @himorishige :: PR: #818
- fix(deprecated-multiple-instances-flag): fix deprecated multiple instances by @AWarno :: PR: #838
- fix(nel-assistant): correct --model-type to --model_type in SKILL.md by @himorishige :: PR: #813
- feat(malformed-configs-validation): validation of malformed configs by @AWarno :: PR: #811
- fix: fixes for user-reported bugs after 0.2 release by @marta-sd :: PR: #837
- docs(post_cmd): add post_cmd documentation by @e-dobrowolska :: PR: #841
- feat: add configurable health check timeout for local executor by @laszkiewiczp :: PR: #844
- chore: Simplify launcher evaluation templates and skill guidance by @piojanu :: PR: #846
- chore: Remove duplicated skill for byob, add it to readme and marketplace by @wprazuch :: PR: #845
- chore: Update for 26.03 by @wprazuch :: PR: #852
- chore: VLMEvalkit bump by @wprazuch :: PR: #853
- fix: bypass unlisted-task safeguard for local .sqsh by @gchlebus :: PR: #849