Releases · NVIDIA-NeMo/Evaluator

03 Jun 16:12

nemo-automation-bot

v0.3.0

d2f7b17

NVIDIA Evaluator 0.3.0 Latest

Release: NVIDIA Evaluator 0.3.0

21 May 11:54

svcnvidia-nemo-ci

nemo-evaluator-launcher-v0.2.6

ab491cb

NVIDIA NeMo Evaluator Launcher 0.2.6

Changelog Details

fix(export): add large Gym artifacts to excluded files by @marta-sd :: PR: #907
fix(nel): fall back to main when version tag not found in skills add by @piojanu :: PR: #906
fix(launcher): re-raise unrelated ModuleNotFoundError in container_metadata init by @wprazuch :: PR: #909
feat: add export_mounts and export_image to auto-export config by @agronskiy :: PR: #911
fix: move auto-export to separate CPU-only sbatch job by @AdamRajfer :: PR: #901
fix: make SWEbench live progress script restart-safe by @piojanu :: PR: #872
fix: force base-10 in bash _walltime_to_seconds by @agronskiy :: PR: #916
docs: remove internal GitLab URL from launching-evals skill by @piojanu :: PR: #920
feat: improve launching-evals and nel-assistant skills by @piojanu :: PR: #910
fix(exporters): socket name too long for long hostnames by @prokotg :: PR: #921
feat(launcher): expose GPUs to eval container for compute-eval by @wprazuch :: PR: #912
chore(launcher): add exclude_patterns to mlflow exporter (EVAL-632) by @agronskiy :: PR: #940
feat(mlflow/exporter): default to proxied multipart upload by @agronskiy :: PR: #952
fix: use docker cp to make sure artifacts are populated in DinD scenarios by @marta-sd :: PR: #1019
fix: disable xtrace before setting up env vars holding secrets by @marta-sd :: PR: #1021
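
The base-10 fix listed above (#916) appears to address a common bash pitfall: arithmetic expansion treats a number with a leading zero as octal, so a field such as `08` or `09` is rejected unless it is written as `10#08`. The Python sketch below only illustrates the intended semantics, parsing every field explicitly as decimal. The function name, the accepted `[D-]HH:MM:SS` format, and the error handling are assumptions made for illustration, not the launcher's actual `_walltime_to_seconds` implementation.

```python
def walltime_to_seconds(walltime: str) -> int:
    """Convert a walltime string of the form [D-]HH:MM:SS to seconds.

    Hypothetical helper: the accepted format is an assumption for this sketch.
    """
    days = 0
    if "-" in walltime:
        # Optional day prefix, e.g. "1-00:00:00".
        day_part, walltime = walltime.split("-", 1)
        days = int(day_part, 10)

    fields = walltime.split(":")
    if len(fields) != 3:
        raise ValueError(f"expected [D-]HH:MM:SS, got {walltime!r}")

    # Parse each field explicitly as base 10, so "08" and "09" are valid.
    hours, minutes, seconds = (int(field, 10) for field in fields)
    return ((days * 24 + hours) * 60 + minutes) * 60 + seconds
```

For instance, `walltime_to_seconds("01:08:09")` evaluates to 4089, whereas the equivalent unguarded bash arithmetic on the `08` and `09` fields would fail with an invalid-octal error.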

Contributors

marta-sd, agronskiy, and 4 other contributors

13 May 15:09

github-actions

v0.3.0-dev

7a44667

0.3.0 dev builds (latest: 0.3.0.dev26) Pre-release

Rolling dev builds from the dev/0.3.0 branch.

Latest: 0.3.0.dev26

Install

pip install https://github.com/NVIDIA-NeMo/Evaluator/releases/download/v0.3.0-dev/nemo_evaluator-0.3.0.dev26-py3-none-any.whl

Versioning

| Branch | Version format | Published |
|---|---|---|
| `dev/0.3.0` | `0.3.0.devN` (N = commit count) | here (GitHub pre-release) |
| `main` | `0.3.0` | PyPI |

This release is updated automatically on every push to dev/0.3.0 and on manual workflow dispatch. It is not intended for production use.
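
As a rough illustration of the versioning scheme above, the following sketch builds a dev version string from a commit count and derives the matching wheel URL. The URL and file-name pattern are copied from the install command on this page. Whether older `devN` wheels remain available under the same pattern is an assumption, and the helper names are hypothetical.

```python
import re
from typing import Optional, Tuple

# Copied from the install command above; assumed to hold for any devN build.
RELEASE_BASE = "https://github.com/NVIDIA-NeMo/Evaluator/releases/download/v0.3.0-dev"

# Matches versions such as "0.3.0.dev26".
DEV_VERSION_RE = re.compile(r"^(\d+\.\d+\.\d+)\.dev(\d+)$")


def dev_version(commit_count: int, base: str = "0.3.0") -> str:
    """Build a dev version string where N is the commit count."""
    if commit_count < 0:
        raise ValueError("commit_count must be non-negative")
    return f"{base}.dev{commit_count}"


def wheel_url(version: str) -> str:
    """Return the wheel URL for a given version, following the pattern above."""
    return f"{RELEASE_BASE}/nemo_evaluator-{version}-py3-none-any.whl"


def parse_dev_version(version: str) -> Optional[Tuple[str, int]]:
    """Split "X.Y.Z.devN" into (base, N); return None for non-dev versions."""
    match = DEV_VERSION_RE.match(version)
    if match is None:
        return None
    return match.group(1), int(match.group(2))
```

Calling `wheel_url(dev_version(26))` reproduces the URL in the install command, and `parse_dev_version("0.3.0")` returns `None` because the `main` branch publishes plain release versions.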

08 May 06:51

svcnvidia-nemo-ci

nemo-evaluator-v0.2.8

76c865c

NVIDIA NeMo Evaluator 0.2.8

Changelog Details

feat(byob): add completions_logprob endpoint and extend scorers/datasets by @kanishks-23 :: PR: #953
feat(byob): add explicit few-shot dataset support by @kanishks-23 :: PR: #993

Contributors

kanishks-23

29 Apr 15:07

svcnvidia-nemo-ci

nemo-evaluator-v0.2.7

b93f1f4

NVIDIA NeMo Evaluator 0.2.7

Changelog Details

feat: [EVAL-878] allow custom HTTP headers in payload_modifier by @wprazuch :: PR: #945

Contributors

wprazuch

16 Apr 09:49

svcnvidia-nemo-ci

nemo-evaluator-v0.2.6

cb5e2f8

NVIDIA NeMo Evaluator 0.2.6

Changelog Details

fix: write BYOB results to per-benchmark subdirectory to avoid data overwriting by @laszkiewiczp :: PR: #856
fix: use normalized name in BYOB FDF evaluation entry by @laszkiewiczp :: PR: #855
feat: BYOB add output_parser parameter to judge_score() by @laszkiewiczp :: PR: #859
fix: byob readme example by @laszkiewiczp :: PR: #854
fix: remove obsolete run_eval from `__all__` by @marta-sd :: PR: #886
feat(per-sample-score): per sample score by @AWarno :: PR: #888
feat: replace Werkzeug dev server with waitress for high-concurrency adapter by @agronskiy :: PR: #896
fix: move logger creation in ProgressTrackingInterceptor to the top by @marta-sd :: PR: #900
fix(evaluator): distinguish interrupted and failed sigterm exits by @ngoncharenko :: PR: #882
fix: use poll() and disable IPv6 in waitress adapter server by @agronskiy :: PR: #905

Contributors

marta-sd, ngoncharenko, and 3 other contributors

16 Apr 09:49

svcnvidia-nemo-ci

nemo-evaluator-launcher-v0.2.5

cb5e2f8

NVIDIA NeMo Evaluator Launcher 0.2.5

Changelog Details

feat: support arbitrary sbatch flags via sbatch_extra_flags by @gchlebus :: PR: #864
feat(extra-params): export extra params by @AWarno :: PR: #873
docs: skill cleanups and fixes by @piojanu :: PR: #878
docs: add auxiliary deployments example and documentation by @AdamRajfer :: PR: #875
feat: allow duplicate task names in nel by @laszkiewiczp :: PR: #874
fix: add missing task_idx arg to TestSbatchExtraFlags by @laszkiewiczp :: PR: #885
feat: syntactic sugar overrides for tasks by @anowaczynski-nvidia :: PR: #759
feat: add watch mode for continuous checkpoint evaluation by @marta-sd :: PR: #857
feat: expose invocation ID as NEL_INVOCATION_ID env var by @agronskiy :: PR: #894
feat: replace Werkzeug dev server with waitress for high-concurrency adapter by @agronskiy :: PR: #896
feat: mount results for deployment by @AdamRajfer :: PR: #899
fix: raise error when execution.env_vars is used in config by @marta-sd :: PR: #898
fix(evaluator): distinguish interrupted and failed sigterm exits by @ngoncharenko :: PR: #882

Contributors

gchlebus, marta-sd, and 7 other contributors

19 Mar 08:32

svcnvidia-nemo-ci

nemo-evaluator-launcher-v0.2.4

26f45ea

NVIDIA NeMo Evaluator Launcher 0.2.4

Changelog Details

feat: deploy auxiliary endpoints by @wprazuch :: PR: #830
feat: add launching-evals and accessing-mlflow skills by @piojanu :: PR: #865
feat: rename to nel skills add and add marketplace entries by @piojanu :: PR: #868

Contributors

piojanu and wprazuch

18 Mar 01:35

svcnvidia-nemo-ci

nemo-evaluator-v0.2.5

a8c6072

NVIDIA NeMo Evaluator 0.2.5

Changelog Details

feat: add --platform flag for BYOB container builds by @laszkiewiczp :: PR: #832
chore: Remove duplicated skill for byob, add it to readme and marketplace by @wprazuch :: PR: #845
fix: remove deprecated api_key field from ApiEndpoint by @gchlebus :: PR: #850

Contributors

gchlebus, wprazuch, and laszkiewiczp

18 Mar 01:35

svcnvidia-nemo-ci

nemo-evaluator-launcher-v0.2.3

a8c6072

NVIDIA NeMo Evaluator Launcher 0.2.3

Changelog Details

docs(nemotron-3-super): reproducible configs by @prokotg :: PR: #840
docs(SKILL.md): add ARM64 and non-standard GPU compatibility note by @himorishige :: PR: #818
fix(deprecated-multiple-instances-flag): fix deprecated multiple instances by @AWarno :: PR: #838
fix(nel-assistant): correct --model-type to --model_type in SKILL.md by @himorishige :: PR: #813
feat(malformed-configs-validation): validation of malformed configs by @AWarno :: PR: #811
fix: fixes for user-reported bugs after 0.2 release by @marta-sd :: PR: #837
docs(post_cmd): add post_cmd documentation by @e-dobrowolska :: PR: #841
feat: add configurable health check timeout for local executor by @laszkiewiczp :: PR: #844
chore: Simplify launcher evaluation templates and skill guidance by @piojanu :: PR: #846
chore: Remove duplicated skill for byob, add it to readme and marketplace by @wprazuch :: PR: #845
chore: Update for 26.03 by @wprazuch :: PR: #852
chore: VLMEvalkit bump by @wprazuch :: PR: #853
fix: bypass unlisted-task safeguard for local .sqsh by @gchlebus :: PR: #849

Contributors

gchlebus, marta-sd, and 7 other contributors
