CUDA out of memory in validation #1432

Open
@ChenDirk

Description

I am training a model on 8 GPUs. During training, CUDA memory usage is about 19K MiB (of 24K MiB total), but validation needs more than 24K MiB, so it runs out of memory and training stops. In training, the random crop size is 512×512; in validation and testing, the data is resized to 512×512 (keep_ratio=False). At first I thought the cause was the softmax layer at inference, because the number of classes is very large (194). I removed it, but that did not fix the problem. Can you tell me other possible reasons for the problem? Thanks a lot!
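As a rough sanity check on the class-count hypothesis, the size of a dense per-pixel score tensor can be estimated by hand. The sketch below is plain arithmetic, not a measurement: `score_map_mib` is a hypothetical helper, float32 scores are assumed, and the 2048×1024 resolution is an assumed example of scores being upsampled to a larger original image, which the report does not confirm.

```python
def score_map_mib(num_classes, height, width, bytes_per_value=4, batch=1):
    """Size in MiB of a dense (batch, num_classes, height, width) score tensor."""
    return batch * num_classes * height * width * bytes_per_value / 2**20


# Values from the report: 194 classes, 512x512 input; float32 is an assumption.
at_input_size = score_map_mib(194, 512, 512)

# Hypothetical case: scores upsampled to a 2048x1024 original image
# (this resolution is assumed for illustration, it is not from the report).
at_larger_size = score_map_mib(194, 1024, 2048)

print(f"512x512 scores:   {at_input_size:.0f} MiB per image")
print(f"2048x1024 scores: {at_larger_size:.0f} MiB per image")
```

Under these assumptions, one 512×512 score map is far smaller than the roughly 5K MiB gap between training and validation, so it may be worth checking whether scores are produced at a larger resolution, or whether several copies of them are held at once, during validation.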