
How to save the final-step checkpoint during GRPO training #4574

Open
@pureoxygen123

When I train with GRPO, checkpoints are saved only at multiples of save_steps. Is there a way to also save a checkpoint at the final step?

Train:  99%|█████████▉| 80/81 [4:49:11<03:45, 225.62s/it]
Train:  99%|█████████▉| 80/81 [4:49:11<03:45, 225.62s/it]
Train:  99%|█████████▉| 80/81 [4:49:11<03:36, 216.89s/it]
[INFO:swift] last_model_checkpoint: /home/output_checkpoints/grpo/v6-20250609-104706/checkpoint-80
[INFO:swift] best_model_checkpoint: None
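
To make the arithmetic behind this log concrete, here is a minimal sketch. It assumes the run has 81 steps in total (as the progress bar suggests), that the step-based rule fires only when the step number is a multiple of save_steps (20 in the configuration below), and that save_total_limit prunes the oldest checkpoints first. `saved_steps` is a hypothetical helper for illustration, not an ms-swift function.

```python
def saved_steps(max_steps: int, save_steps: int) -> list[int]:
    """Steps at which a 'save every save_steps steps' rule would trigger."""
    return [s for s in range(1, max_steps + 1) if s % save_steps == 0]


steps = saved_steps(max_steps=81, save_steps=20)
print(steps)        # [20, 40, 60, 80]
print(81 in steps)  # False: the final step is not a multiple of 20

# Assumption: with save_total_limit = 2, only the two newest checkpoints are kept.
kept = steps[-2:]
print(kept)         # [60, 80]
```

Under these assumptions the last checkpoint on disk is checkpoint-80, which matches the `last_model_checkpoint` path reported above.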

Possibly relevant configuration parameters:
"save_strategy": "steps",
"save_steps": 20.0,
"save_total_limit": 2,
"save_safetensors": true,
"save_on_each_node": false,
"save_only_model": false,

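One possible workaround, sketched under assumptions: if the trainer follows the Hugging Face `Trainer` callback protocol (`on_step_end(args, state, control, **kwargs)` with `state.global_step`, `state.max_steps`, and `control.should_save`), a small callback could request a save once the final step is reached. The class name `SaveOnLastStep` is hypothetical, the class is duck-typed here so the snippet runs without extra dependencies (in real use it would subclass `transformers.TrainerCallback`), and whether and how ms-swift lets you register such a callback is not confirmed by this issue.

```python
from types import SimpleNamespace


class SaveOnLastStep:
    """Hypothetical callback: request a checkpoint when the last step finishes."""

    def on_step_end(self, args, state, control, **kwargs):
        # Flag a save once the global step count reaches the planned total.
        if state.global_step >= state.max_steps:
            control.should_save = True
        return control


# Stand-in objects that mimic the trainer state for a quick check.
callback = SaveOnLastStep()
control = SimpleNamespace(should_save=False)

callback.on_step_end(None, SimpleNamespace(global_step=80, max_steps=81), control)
print(control.should_save)  # False: step 80 is not the last step

callback.on_step_end(None, SimpleNamespace(global_step=81, max_steps=81), control)
print(control.should_save)  # True: the final step requests a save
```

A simpler alternative, also an assumption about this particular run, would be to choose a save_steps value that divides the total step count evenly.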