Commit 3dc98de
Add test coverage for Muon muon_lr/adam_lr overrides (#8047)
## Summary
Add coverage for separate learning rate overrides in the Muon optimizer
path and fix the related Muon blog documentation.
## Background
Muon parameters and non-Muon parameters are automatically split into
separate optimizer groups. The intended behavior is:
- `muon_lr` applies to Muon parameter groups
- `adam_lr` applies to Adam parameter groups
- `lr` remains the fallback for both groups when overrides are not
provided
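
The fallback rule above can be sketched in a few lines. This is an illustrative sketch only, not DeepSpeed's implementation: the helper name `resolve_group_lrs` and the returned `"muon"` / `"adam"` keys are hypothetical. Only the `lr`, `muon_lr`, and `adam_lr` parameter names come from this commit.

```python
from typing import Any, Dict


def resolve_group_lrs(optimizer_params: Dict[str, Any]) -> Dict[str, float]:
    """Hypothetical helper: pick the learning rate for each parameter group.

    `muon_lr` and `adam_lr` override `lr` for their own group; `lr` is the
    fallback for any group whose override is missing.
    """
    base_lr = optimizer_params.get("lr")

    muon_lr = optimizer_params.get("muon_lr")
    if muon_lr is None:  # no override: fall back to the shared lr
        muon_lr = base_lr

    adam_lr = optimizer_params.get("adam_lr")
    if adam_lr is None:  # no override: fall back to the shared lr
        adam_lr = base_lr

    if muon_lr is None or adam_lr is None:
        raise ValueError("provide 'lr', or both 'muon_lr' and 'adam_lr'")

    return {"muon": muon_lr, "adam": adam_lr}


# Legacy behavior: a single `lr` is applied to both groups.
legacy = resolve_group_lrs({"lr": 1e-3})

# Override behavior: each group takes its own learning rate.
overridden = resolve_group_lrs({"lr": 1e-3, "muon_lr": 2e-2, "adam_lr": 3e-4})
```

The two calls at the end correspond to the two cases the new test is described as covering: the legacy `lr` fallback and the separate `muon_lr` / `adam_lr` overrides.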
## Changes
- add a parameterized test covering:
  - legacy `lr` fallback behavior
  - separate `muon_lr` / `adam_lr` override behavior
- fix the Muon blog table header to label `muon_lr` and `adam_lr`
  correctly
## Validation
Ran:
`python -m pytest DeepSpeed/tests/unit/ops/muon/test_muon_partial_training.py -k learning_rate_overrides -q -rs`
Result:
- test collected successfully
- skipped locally because this distributed test requires 2 GPUs, while
the local environment has 1 GPU
---------
Signed-off-by: Sowndappan S <147894621+sowndappan5@users.noreply.github.com>

1 parent 28a196f commit 3dc98de
1 file changed
Lines changed: 41 additions & 0 deletions
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| 22-24 | 22-24 | unchanged context |
| | 25 | + (1 line added) |
| 25-27 | 26-28 | unchanged context |
| 173-175 | 174-176 | unchanged context |
| | 177-216 | + (40 lines added) |