Support context manager when toggling optimizers #17294
Comments
I think this is a very reasonable request :) I'm in favor. The untoggling could be easily forgotten, and with the context manager it is one less thing to think about.
I'm in favor of the context manager option to toggle and untoggle. I suggest naming it [...], but I don't think it should [...].
Thank you for your valuable comments @awaelchli @carmocca ! I also don't think it's right to force [...]. Therefore, the new design will be:

```python
def toggled_optimizer(self, optimizer: Optimizer, auto_zero: bool = True, auto_step: bool = True) -> OptimizerToggle:
    return OptimizerToggle(self, optimizer, auto_zero, auto_step)


class OptimizerToggle:
    def __init__(self, module: LightningModule, optimizer: Optimizer, auto_zero: bool = True, auto_step: bool = True) -> None:
        self.module = module
        self.optimizer = optimizer
        self.auto_zero = auto_zero
        self.auto_step = auto_step

    def __enter__(self) -> None:
        ...
        self.module.toggle_optimizer(self.optimizer)
        # skipped if there is no need to set the initial grad to zero (by default grads are set to `None`)
        if self.auto_zero:
            self.optimizer.zero_grad()

    def __exit__(self, exc_type, exc_value, traceback) -> None:
        ...
        # skipped if a specific closure or argument is needed for the optimization
        if self.auto_step:
            self.optimizer.step()
        self.module.untoggle_optimizer(self.optimizer)
```

Reason 1

Because, at least in my experience, even though the two methods [...]

Reason 2

Moreover, even if the context manager is used, [...]

Supplement

```python
with self.toggled_optimizer(opt, auto_zero=False, auto_step=False):  # ---- (1)
    opt.zero_grad(set_to_none=False)
    loss = LOSS_COMPUTATION(...)
    self.manual_backward(loss)  # ---- (2)
    opt.step(CLOSURE, **KWARGS)
```

or (if no specific modification of the optimization is needed)

```python
with self.toggled_optimizer(opt):
    # OptimizerToggle runs `opt.zero_grad()` with `set_to_none=True`
    loss = LOSS_COMPUTATION(...)
    self.manual_backward(loss)  # ---- (2)
    # OptimizerToggle runs `opt.step()` without any closure or argument
```
Hello! It's already been about two months since I wrote this issue :)
The context manager usability improvement is a good idea. But I would still not add the optional extra arguments. The reason is that it adds one more different way to do things that only removes two lines of code. It takes away from the expressivity of PyTorch autograd notation. Feel free to submit the PR. Other reviewers might have a different view.
I agree, the context manager is nice but it should only toggle. It's a nice feature because you can't forget to "untoggle". I think it should not do anything extra beyond that, because in manual optimization, control of step and zero grad is a feature, not a burden. IMO, if automation is desired, one should choose automatic optimization from the start.
Description & Motivation
TL;DR
Replace this
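(A minimal sketch of the boilerplate in question, assuming a single optimizer `opt` inside a manual-optimization `training_step`; `compute_loss` is a placeholder, not an existing API:)

```python
self.toggle_optimizer(opt)        # freeze all parameters except those owned by `opt`
opt.zero_grad()
loss = self.compute_loss(batch)   # placeholder for the actual loss computation
self.manual_backward(loss)
opt.step()
self.untoggle_optimizer(opt)      # easy to forget, which silently affects later optimizers
```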
To this
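(And the proposed form; here `toggle_optimizer` is assumed to return the context manager described in the Pitch below:)

```python
with self.toggle_optimizer(opt):     # toggles on enter, zero_grads by default, untoggles on exit
    loss = self.compute_loss(batch)  # placeholder
    self.manual_backward(loss)
    opt.step()
```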
Description
Since PyTorch Lightning 2.0, manual optimization is available and is typically used for working with multiple optimizers in a single LightningModule instance.
However, if you want to step through n optimizers in order, you have to keep writing the same boilerplate, as sketched below:
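For illustration, this is the block that has to be repeated for every optimizer; `opt_i` and `compute_loss_i` stand for the i-th optimizer and its loss and are placeholders:

```python
# inside training_step, repeated once per optimizer, in the desired order
self.toggle_optimizer(opt_i)
opt_i.zero_grad()
loss_i = self.compute_loss_i(batch)
self.manual_backward(loss_i)
opt_i.step()
self.untoggle_optimizer(opt_i)
# ... and again for the next optimizer, and the one after that
```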
The above approach makes it difficult to focus on the actual logic and often leads to mistakes. Fortunately, Python offers the `with` statement to prevent this repetition and the mistakes that result from it. For example, when you open a file in Python, you may write:
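For instance, the explicit form, where closing the file has to be remembered by hand:

```python
f = open("example.txt")
data = f.read()
f.close()  # must not be forgotten, even when an exception occurs in between
```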
But you can also use the simpler form:
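With the context manager, the cleanup is implicit:

```python
with open("example.txt") as f:  # the file is closed automatically when the block exits
    data = f.read()
```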
Likewise, if the optimizer boilerplate is managed automatically and implicitly by the LightningModule, the readability and overall quality of the code will improve considerably.
Pitch
Suggestion
There are two major changes in `LightningModule.toggle_optimizer()`:

1. `toggle_optimizer()` returns a context manager that implements the two magic methods `__enter__` and `__exit__`.
2. `toggle_optimizer()` takes a boolean parameter `zero_grad` (default: `True`). This is because a model can be linked to multiple optimizers and vice versa; for such cases, the automatic `zero_grad` should be left as an option for the programmer.

Solution
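A sketch of what this could look like, built on top of the existing `toggle_optimizer()`/`untoggle_optimizer()` via `contextlib.contextmanager`; the subclass and its use of `super()` are illustrative assumptions, not the actual Lightning implementation:

```python
from contextlib import contextmanager
from typing import Iterator

import lightning.pytorch as pl
from torch.optim import Optimizer


class MyModule(pl.LightningModule):
    @contextmanager
    def toggle_optimizer(self, optimizer: Optimizer, zero_grad: bool = True) -> Iterator[Optimizer]:
        super().toggle_optimizer(optimizer)   # change 1: the toggle now happens on __enter__
        if zero_grad:                         # change 2: automatic zero_grad, opt-out via the flag
            optimizer.zero_grad()
        try:
            yield optimizer
        finally:
            # the untoggle can no longer be forgotten, and it also runs if the body raises
            super().untoggle_optimizer(optimizer)
```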
Then, we can use the `with` statement as follows:
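For example, under the same assumptions as the sketch above (`compute_loss` is a placeholder):

```python
def training_step(self, batch, batch_idx):
    opt = self.optimizers()

    with self.toggle_optimizer(opt):        # zero_grad runs here by default
        loss = self.compute_loss(batch)
        self.manual_backward(loss)
        opt.step()
    # the optimizer is untoggled automatically at this point
```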
Use case (GAN)
Without suggestion (current approach):
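A sketch of how the training step currently looks for a GAN with separate generator and discriminator optimizers; the loss helpers are placeholders:

```python
def training_step(self, batch, batch_idx):
    opt_g, opt_d = self.optimizers()

    # Generator update
    self.toggle_optimizer(opt_g)
    opt_g.zero_grad()
    g_loss = self.generator_loss(batch)       # placeholder
    self.manual_backward(g_loss)
    opt_g.step()
    self.untoggle_optimizer(opt_g)

    # Discriminator update
    self.toggle_optimizer(opt_d)
    opt_d.zero_grad()
    d_loss = self.discriminator_loss(batch)   # placeholder
    self.manual_backward(d_loss)
    opt_d.step()
    self.untoggle_optimizer(opt_d)
```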
With suggestion (newer approach):
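And the same step with the proposed context manager (same placeholders):

```python
def training_step(self, batch, batch_idx):
    opt_g, opt_d = self.optimizers()

    # Generator update
    with self.toggle_optimizer(opt_g):          # zero_grad and untoggle handled by the context manager
        g_loss = self.generator_loss(batch)
        self.manual_backward(g_loss)
        opt_g.step()

    # Discriminator update
    with self.toggle_optimizer(opt_d):
        d_loss = self.discriminator_loss(batch)
        self.manual_backward(d_loss)
        opt_d.step()
```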
Alternatives
New method
For several reasons, including backward compatibility, it may be a bad idea to change the existing behavior of a method.
Therefore, I also propose an alternative: a new method that calls `toggle_optimizer()` internally.

Candidate names for the new method

- `session`: inspired by `tf.Session()` from TensorFlow v1.
- `focus_optimizer`: because focusing on a certain optimizer is what we actually want to do.
- `switch_optimizer`: a synonym of "toggle".

Other suggestions are welcome :)
Additional context
No response
cc @Borda @carmocca @justusschock @awaelchli