This repository was archived by the owner on Jun 16, 2025. It is now read-only.

CUDA error: initialization error in training #48

@lokeaichirou


RuntimeError Traceback (most recent call last)
in <module>()
1 if __name__ == "__main__":
----> 2 main(args)

5 frames
in main(args)
69
70 epoch_loss = train(train_epoch, model, optimizer, train_loader,
---> 71 vis, vis_window, args)
72 all_training_losses.append(epoch_loss)
73

in train(epoch, model, optimizer, train_loader, vis, vis_window, args)
7
8 sample_prob = min(args.sample_prob, int(epoch / 5) * 0.05)
----> 9 for train_iter, data in enumerate(train_loader):
10 (img_batch, tempo_seg_pos, tempo_seg_neg, sentence_batch) = data
11 img_batch = Variable(img_batch)

/usr/local/lib/python3.7/dist-packages/torch/utils/data/dataloader.py in __next__(self)
515 if self._sampler_iter is None:
516 self._reset()
--> 517 data = self._next_data()
518 self._num_yielded += 1
519 if self._dataset_kind == _DatasetKind.Iterable and \

/usr/local/lib/python3.7/dist-packages/torch/utils/data/dataloader.py in _next_data(self)
1197 else:
1198 del self._task_info[idx]
-> 1199 return self._process_data(data)
1200
1201 def _try_put_index(self):

/usr/local/lib/python3.7/dist-packages/torch/utils/data/dataloader.py in _process_data(self, data)
1223 self._try_put_index()
1224 if isinstance(data, ExceptionWrapper):
-> 1225 data.reraise()
1226 return data
1227

/usr/local/lib/python3.7/dist-packages/torch/_utils.py in reraise(self)
427 # have message field
428 raise self.exc_type(message=msg)
--> 429 raise self.exc_type(msg)
430
431

RuntimeError: Caught RuntimeError in DataLoader worker process 0.
Original Traceback (most recent call last):
File "/usr/local/lib/python3.7/dist-packages/torch/utils/data/_utils/worker.py", line 202, in _worker_loop
data = fetcher.fetch(index)
File "/usr/local/lib/python3.7/dist-packages/torch/utils/data/_utils/fetch.py", line 47, in fetch
return self.collate_fn(data)
File "/content/anet_dataset.py", line 514, in anet_collate_fn
sentence_batch[batch_idx] = sentence.data
RuntimeError: CUDA error: initialization error
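
A note on the likely cause, hedged because the code in anet_dataset.py is not shown here: "CUDA error: initialization error" raised inside a DataLoader worker usually means a CUDA tensor is being created or indexed in a forked worker process after CUDA was already initialized in the parent. The failing line is in anet_collate_fn, so either sentence_batch or sentence is probably on the GPU at that point. Below is a minimal sketch of the usual workaround, keeping everything CPU-only inside the dataset and collate function and moving the batch to the GPU in the main process. ToySentences, cpu_collate_fn and first_batch are hypothetical names for illustration, not the repository's code.

```python
import torch
from torch.utils.data import DataLoader, Dataset


class ToySentences(Dataset):
    """Hypothetical stand-in for the dataset defined in anet_dataset.py."""

    def __len__(self):
        return 8

    def __getitem__(self, idx):
        # Return CPU tensors only; worker processes should not touch CUDA.
        return torch.full((5,), idx, dtype=torch.long)


def cpu_collate_fn(samples):
    # Allocate the batch on the CPU. Doing this with CUDA tensors would run
    # inside a forked worker, which is the typical trigger for
    # "CUDA error: initialization error".
    sentence_batch = torch.zeros(len(samples), 5, dtype=torch.long)
    for batch_idx, sentence in enumerate(samples):
        sentence_batch[batch_idx] = sentence
    return sentence_batch


def first_batch(num_workers=0):
    loader = DataLoader(ToySentences(), batch_size=4, shuffle=False,
                        num_workers=num_workers, collate_fn=cpu_collate_fn)
    device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
    for sentence_batch in loader:
        # Move to the GPU in the main process, after the worker has returned.
        return sentence_batch.to(device)
```

If the collate function really has to produce CUDA tensors, two other options that are commonly suggested are running the loader with num_workers=0, or calling torch.multiprocessing.set_start_method("spawn") before the DataLoader is created. Whether either fits this repository has not been checked.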
