support deepseekv4 flash fsdp2 training and reduce Transformers v5 model loading memory pressure #190
Open
meichangsu1 wants to merge 8 commits into
Commits
Commits on May 12, 2026