Time IDs in SDXL training #9459
Replies: 3 comments
-
cc: @sayakpaul
-
This matters to me because I'm training a classifier on these latents for classifier guidance, but I can only train it on crops (because I don't have the full data). So I need to be sure that when I apply the classifier to the whole image, the time IDs will be the same as for a crop at the same place in the image.
-
These are micro-conditions that were introduced in the SDXL paper. You can read more about them in Section 2.2 of the paper: https://arxiv.org/abs/2307.01952.
-
I was trying to understand the time IDs in the SDXL training script. Each example has an array such as [1024, 1034, 0, 10, 1024, 1024], which is fed directly into the Timesteps forward pass. I'm wondering what happens there: how does each of these integers (original_height, original_width, crop_coord_top, crop_coord_left, target_resolution, target_resolution) get a kind of exponential embedding, and what is the expected output? There is something I'm missing. Thanks for your help!
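To make the question concrete, here is a minimal standard-library sketch of a sinusoidal ("timestep-style") embedding applied to each of the six integers independently. It is not the training script's code: the helper names `sinusoidal_embedding` and `embed_time_ids` are hypothetical, and the embedding width of 256 per value, the `max_period` of 10000, and the cosine-before-sine ordering are assumptions about the usual SDXL configuration that this thread does not confirm.

```python
import math


def sinusoidal_embedding(value, dim=256, max_period=10000.0):
    """Embed one scalar as [cos(value * f_i)] + [sin(value * f_i)].

    The frequencies f_i decay geometrically from 1 down towards 1/max_period.
    dim=256 and the cos-first ordering are assumptions, not confirmed here.
    """
    half = dim // 2
    freqs = [math.exp(-math.log(max_period) * i / half) for i in range(half)]
    args = [value * f for f in freqs]
    return [math.cos(a) for a in args] + [math.sin(a) for a in args]


def embed_time_ids(time_ids, dim=256):
    """Embed every entry separately and concatenate the results."""
    out = []
    for v in time_ids:
        out.extend(sinusoidal_embedding(float(v), dim))
    return out


# Example array from the question: original size, crop top-left, target size.
time_ids = [1024, 1034, 0, 10, 1024, 1024]
emb = embed_time_ids(time_ids)
print(len(emb))  # 6 values * 256 dims each
```

Under these assumptions, each integer becomes a fixed vector of cosines and sines at geometrically spaced frequencies, and the six vectors are concatenated into one conditioning vector. The mapping is deterministic and depends only on the six values, so two inputs with identical time IDs get identical embeddings.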