Data loading when resuming training from a checkpoint: the IterableDataset problem #964

Open
Zxr1314 opened this issue Mar 23, 2025 · 1 comment

Comments

Zxr1314 commented Mar 23, 2025

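Does this part currently support saving the dataset's state_dict, or is there a plan to support it in the future? Without it, an interruption during training can lead either to training on repeated data (which hurts model performance) or to running through the data from the beginning before training resumes (which wastes time).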
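
To make the request concrete, here is a minimal sketch of what a dataset-level state_dict could look like. It is not code from this repository: `ResumableStream` and `demo_resume` are hypothetical names, and a plain Python iterable stands in for a real `IterableDataset` so the example runs without extra dependencies.

```python
import random


class ResumableStream:
    """Hypothetical stand-in for an IterableDataset that can checkpoint its position."""

    def __init__(self, samples, seed=0):
        self.samples = list(samples)
        self.seed = seed
        self.epoch = 0
        self.offset = 0  # number of samples already yielded in the current epoch

    def _order(self):
        # Deterministic shuffle: the same seed and epoch always give the same order.
        order = list(range(len(self.samples)))
        random.Random(self.seed + self.epoch).shuffle(order)
        return order

    def __iter__(self):
        order = self._order()
        # Start from the saved offset so consumed samples are not replayed.
        for position in range(self.offset, len(order)):
            self.offset = position + 1
            yield self.samples[order[position]]
        # Epoch finished: move to the next epoch and reset the position.
        self.epoch += 1
        self.offset = 0

    def state_dict(self):
        return {"seed": self.seed, "epoch": self.epoch, "offset": self.offset}

    def load_state_dict(self, state):
        self.seed = state["seed"]
        self.epoch = state["epoch"]
        self.offset = state["offset"]


def demo_resume():
    # Uninterrupted run, used as the reference order.
    reference = list(ResumableStream(range(10), seed=42))

    # Interrupted run: consume four samples, then checkpoint.
    stream = ResumableStream(range(10), seed=42)
    iterator = iter(stream)
    consumed = [next(iterator) for _ in range(4)]
    checkpoint = stream.state_dict()  # would be saved next to the model weights

    # Fresh process: restore the position and read only what is left.
    restored = ResumableStream(range(10), seed=42)
    restored.load_state_dict(checkpoint)
    remainder = list(restored)
    return reference, consumed, remainder
```

Restoring the offset means the resumed run neither repeats samples nor has to read through the ones it already used. With multi-worker loading each worker holds its own copy of the dataset, so tracking the position would need more care than this sketch shows.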

@yuecao0119
Collaborator

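Hello,

The current code supports saving the model weights every given number of steps; see the save_steps parameter. It also supports resume_ckpt; see line 1097 of the training code.
This functionality is implemented by Trainer: training continues from the corresponding step, and keeping the seed in the code identical before and after the interruption avoids training on repeated data.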
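
The idea of resuming from a step with an unchanged seed can be sketched as follows. This is an illustration, not the repository's Trainer code: it assumes the data order is fully determined by the seed, and `batch_stream` and `resume_batches` are hypothetical helper names.

```python
import itertools
import random


def batch_stream(num_samples, batch_size, seed):
    """Yield batches of sample indices in an order fully determined by `seed`."""
    order = list(range(num_samples))
    random.Random(seed).shuffle(order)
    for start in range(0, num_samples, batch_size):
        yield order[start:start + batch_size]


def resume_batches(num_samples, batch_size, seed, completed_steps):
    """Rebuild the stream with the same seed and discard batches already trained on."""
    stream = batch_stream(num_samples, batch_size, seed)
    # islice still iterates over the skipped batches; it just does not return them.
    return itertools.islice(stream, completed_steps, None)


# An uninterrupted run versus a run resumed after two completed steps.
full_run = list(batch_stream(num_samples=12, batch_size=3, seed=1234))
resumed = list(resume_batches(num_samples=12, batch_size=3, seed=1234, completed_steps=2))
```

With the same seed, `resumed` matches `full_run[2:]`, so no batch is trained on twice. The skipped batches are still read from the stream, though, which is the time cost the question raises for large iterable datasets. A different seed would give a different order, so the skipped batches would no longer correspond to the ones already used.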
