Description & Motivation
Lightning runs a validation sanity check before training, which is a really useful feature.
However, I have found that training sometimes runs out of memory (OOM) when the checkpoint callback saves the model. (Some frameworks, such as DeepSpeed, use extra VRAM while saving.)
If the save interval is long, the failure only shows up late in the run and a lot of time is wasted.
If the checkpoint callback were also called on the first batch, the OOM problem would surface much earlier and save that time.
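Until something like this is built in, a rough workaround is a small custom callback that saves one throwaway checkpoint right after the first training batch. This is only a minimal sketch: the class name `EarlyCheckpointSmokeTest` and the file name `smoke_test.ckpt` are hypothetical, and it assumes `trainer.save_checkpoint` exercises the same save path (and memory use) as the regular checkpoint callback for the strategy in use.

```python
import os

try:
    from lightning.pytorch.callbacks import Callback
except ImportError:
    # Assumption: fall back to a plain base class so the sketch can still be
    # imported and inspected on a machine without Lightning installed.
    Callback = object


class EarlyCheckpointSmokeTest(Callback):
    """Hypothetical callback: save one checkpoint after the first training batch."""

    def __init__(self, filename="smoke_test.ckpt"):
        super().__init__()
        self.filename = filename
        self.done = False

    def on_train_batch_end(self, trainer, pl_module, outputs, batch, batch_idx):
        # Only run once, on the first training batch that finishes.
        if self.done:
            return
        self.done = True
        # Use the trainer's root dir so every rank resolves the same path.
        path = os.path.join(trainer.default_root_dir, self.filename)
        # If saving is going to OOM, it should fail here instead of hours later.
        trainer.save_checkpoint(path)
```

It would be passed alongside the normal checkpoint callback, e.g. `Trainer(callbacks=[EarlyCheckpointSmokeTest(), ...])`. The smoke-test checkpoint is left on disk and can be deleted by hand once training is running.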
Pitch
No response
Alternatives
No response
Additional context
No response
cc @Borda @awaelchli