Logging metrics at datapoint level #16573
GregorySech asked this question in code help: CV · Unanswered · 0 replies
I'm trying to log some metrics at the datapoint (image/dataset row/sample) level.
For example, in a semantic segmentation task I would like to log, for each image, its intersection over union (IoU) and its loss value.
What I've done so far is to call the LightningModule.log method with a separate log key for each image.
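Roughly, the scheme looks like the sketch below. The `make_metric_key` helper and the metric/ID names are illustrative, not actual Lightning API or my exact code; the point is that every image gets its own TensorBoard tag.

```python
# Simplified sketch of the per-image logging scheme.
# make_metric_key is an illustrative helper, not part of Lightning's API.

def make_metric_key(stage: str, metric: str, image_id: str) -> str:
    """Build a unique TensorBoard tag per image, e.g. 'val/iou/img_0042'."""
    return f"{stage}/{metric}/{image_id}"

# Inside training_step_end / validation_step_end I then log each image's
# metrics individually, roughly like:
#
#   for image_id, iou, loss in zip(batch_ids, ious, losses):
#       self.log(make_metric_key(stage, "iou", image_id), iou)
#       self.log(make_metric_key(stage, "loss", image_id), loss)

print(make_metric_key("val", "iou", "img_0042"))  # → val/iou/img_0042
```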
This method is then called from both training_step_end and validation_step_end with the appropriate stage string. However, I've run into an unexpected issue: 50 steps after the first log, I find the same value logged again on TensorBoard.
I should add that I'm using DDP; however, this behaviour happens regardless of the number of devices. The screenshot refers to a run with a single GPU.
The Dataset uses the filename as the "image_id", so there are no duplicates (and the number of images adds up to the correct dataset size).
I was wondering if this behaviour is to be expected and if there is a smarter way of logging at this level of granularity.