Describe the bug
Hi, I am observing that the learning rate suddenly drops to model.optim.sched.min_lr after model.optim.sched.warmup_steps. I am using CosineAnnealing, where I expect the learning rate to decrease gradually to min_lr after the warmup steps rather than dropping abruptly.
Expected behavior
The learning rate should decay gradually to min_lr after the warmup steps instead of dropping suddenly. If I am configuring this the wrong way, what is the correct way to achieve this?
Environment overview (please complete the following information)
PyTorch version: 2.3
Python version: 3.10
Looking at https://github.com/NVIDIA/NeMo/blob/main/nemo/core/optim/lr_scheduler.py#L353, you may need to set decay_steps (if I'm looking in the correct place). It looks like during warmup_steps the learning rate ramps linearly up to max_lr, then decays to min_lr over decay_steps. Curious if that works for you!
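For anyone who wants to eyeball the intended shape, here is a minimal standalone sketch in plain PyTorch (not NeMo's scheduler) of the warmup-then-cosine-decay behavior described above; the warmup_steps, decay_steps, max_lr, and min_lr values are placeholders, so substitute your own config values:

```python
import math

import torch

# Placeholder values for illustration only; use your actual config values.
warmup_steps = 100
decay_steps = 900
max_lr = 1e-3
min_lr = 1e-5

model = torch.nn.Linear(10, 10)
optimizer = torch.optim.Adam(model.parameters(), lr=max_lr)


def lr_lambda(step: int) -> float:
    # Linear warmup: 0 -> max_lr over warmup_steps.
    if step < warmup_steps:
        return step / max(1, warmup_steps)
    # Cosine decay: max_lr -> min_lr over decay_steps, then hold at min_lr.
    progress = min(1.0, (step - warmup_steps) / max(1, decay_steps))
    cosine = 0.5 * (1.0 + math.cos(math.pi * progress))
    return (min_lr + (max_lr - min_lr) * cosine) / max_lr


scheduler = torch.optim.lr_scheduler.LambdaLR(optimizer, lr_lambda)

for step in range(warmup_steps + decay_steps):
    optimizer.step()
    scheduler.step()
```

If NeMo's CosineAnnealing behaves as described in the comment above, the symptom in this issue would match a decay window that is effectively zero, so the fix would be making sure the decay window (decay_steps, per the comment above) is set under model.optim.sched.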