fix: align correct-sample rewards with DP-local lengths #1900

Open
miamia0 wants to merge 2 commits into THUDM:main from miamia0:fix/correct-sample-raw-reward-alignment

Commits

Commits on May 10, 2026

Commits on May 11, 2026