Using empty_init results in 0 gradient #19720
Unanswered
RuABraun asked this question in DDP / multi-GPU / multi-node
Replies: 1 comment
-
@RuABraun We encountered the same problem when running a model with only one GPU ( |
-
My code looks like
This results in all parameters having 0 gradient. This changes when I remove the `init_module` line. I'm guessing I'm using it wrong. Should I be wrapping everything model-related in it?
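One possible explanation, not confirmed anywhere in this thread: `empty_init=True` skips weight initialization and leaves the parameters as uninitialized memory, which is meant for the case where a checkpoint is loaded right afterwards. If that memory happens to be all zeros, backpropagation through the network yields zero gradients. The sketch below illustrates this with plain PyTorch, using explicit zeroing as a stand-in for the uninitialized memory (an assumption, since real uninitialized memory can hold arbitrary values). The helper names `make_model` and `grad_norms` are hypothetical and not taken from the original code.

```python
import torch
import torch.nn as nn


def make_model() -> nn.Module:
    # Bias-free layers keep the illustration simple (hypothetical toy model).
    return nn.Sequential(
        nn.Linear(4, 16, bias=False),
        nn.ReLU(),
        nn.Linear(16, 1, bias=False),
    )


def grad_norms(model: nn.Module) -> dict[str, float]:
    """Run one forward/backward pass and return each parameter's gradient norm."""
    torch.manual_seed(0)  # same input batch on every call
    x = torch.randn(8, 4)
    model.zero_grad()
    model(x).sum().backward()
    return {name: p.grad.norm().item() for name, p in model.named_parameters()}


# Normally initialized weights.
torch.manual_seed(0)
initialized = make_model()
initialized_grads = grad_norms(initialized)

# Assumption: stand-in for uninitialized memory that happens to be all zeros.
zeroed = make_model()
with torch.no_grad():
    for p in zeroed.parameters():
        p.zero_()
zeroed_grads = grad_norms(zeroed)

# Loading real weights afterwards (as one would with a checkpoint)
# replaces the zeroed parameters.
zeroed.load_state_dict(initialized.state_dict())
restored_grads = grad_norms(zeroed)

print("initialized:", initialized_grads)
print("zeroed:     ", zeroed_grads)
print("restored:   ", restored_grads)
```

If this is what is happening, either loading a checkpoint after creating the model or leaving `empty_init` off when training from scratch would be the things to try.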