[train] Add sum loss reduction mode and fix test file for existing code

efc37ec
Draft

[train] Add DRO (Direct Reward Optimization) policy loss #1259
