I tried to run train.py directly in a Slurm environment and got this error message:
  File "train2.py", line 852, in <module>
    main()
  File "train2.py", line 643, in main
    amp_autocast=amp_autocast, loss_scaler=loss_scaler, model_ema=model_ema, mixup_fn=mixup_fn)
  File "train2.py", line 711, in train_one_epoch
    output = model(input)
  File "/projects/academic/wjzheng/xliu79/anaconda3/envs/ptimm/lib/python3.7/site-packages/torch/nn/modules/module.py", line 1110, in _call_impl
    return forward_call(*input, **kwargs)
  File "/projects/academic/wjzheng/xliu79/pytorch-image-models/timm/models/efficientnet.py", line 557, in forward
    x = self.forward_features(x)
  File "/projects/academic/wjzheng/xliu79/pytorch-image-models/timm/models/efficientnet.py", line 540, in forward_features
    x = self.conv_stem(x)
  File "/projects/academic/wjzheng/xliu79/anaconda3/envs/ptimm/lib/python3.7/site-packages/torch/nn/modules/module.py", line 1110, in _call_impl
    return forward_call(*input, **kwargs)
  File "/projects/academic/wjzheng/xliu79/anaconda3/envs/ptimm/lib/python3.7/site-packages/torch/nn/modules/conv.py", line 447, in forward
    return self._conv_forward(input, self.weight, self.bias)
  File "/projects/academic/wjzheng/xliu79/anaconda3/envs/ptimm/lib/python3.7/site-packages/torch/nn/modules/conv.py", line 444, in _conv_forward
    self.padding, self.dilation, self.groups)
RuntimeError: Expected all tensors to be on the same device, but found at least two devices, cpu and cuda:0! (when checking argument for argument weight in method wrapper___slow_conv2d_forward)
Can someone help?
Thanks!
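For anyone hitting the same error: the message says the conv weight and the input tensor are on different devices (one on cpu, the other on cuda:0). Below is a minimal sketch of a check that could be called just before `output = model(input)` to see which tensors disagree. The helper name `find_device_mismatch` is hypothetical and not part of timm or PyTorch; it assumes only the standard `named_parameters()`, `named_buffers()` and `.device` attributes of `torch.nn.Module` and `torch.Tensor`.

```python
def find_device_mismatch(model, batch):
    """Return (batch_device, {name: device}) for model tensors not on the batch's device.

    Hypothetical helper for illustration: it is duck-typed, so it only needs
    `named_parameters()`, `named_buffers()` and a `.device` attribute.
    """
    batch_device = str(batch.device)
    mismatched = {}
    # Check buffers too (e.g. BatchNorm running stats), not only weights.
    tensors = list(model.named_parameters()) + list(model.named_buffers())
    for name, tensor in tensors:
        if str(tensor.device) != batch_device:
            mismatched[name] = str(tensor.device)
    return batch_device, mismatched
```

If the returned dict is non-empty, the usual fix is to move both sides to one device, for example `model.to(device)` and `input = input.to(device)`. Which side is misplaced in this particular run cannot be determined from the traceback alone.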