support deepseekv4 flash fsdp2 training and reduce Transformers v5 model loading memory pressure #190

Open
meichangsu1 wants to merge 8 commits into modelscope:main from meichangsu1:dsv4_fsdp2_ljl

Commits

Commits on May 12, 2026

Commits on May 13, 2026