Hi,
I noticed that in the following loss computation, loss_decoder only updates the decoder and not the quantizer, because .detach() is applied to the quantized output before it enters the decoder.
data_shuffle = viewer.shuffle(data)
quant_shuffle = quantizer(data_shuffle)["x_quant"]
quant = viewer.unshuffle(quant_shuffle)
data_rec = decoder(quant.detach())
loss_quant = F.mse_loss(data_shuffle, quant_shuffle)
loss_decoder = F.mse_loss(data, data_rec)
loss = loss_quant + loss_decoder
May I ask why you designed it this way, rather than letting the decoder loss update both the decoder and the quantizer (i.e., without the .detach() operation before entering the decoder)?
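To make the question concrete, here is a minimal sketch of the behaviour I am describing. It is not the ReVQ code: the quantizer and decoder are replaced by plain linear layers (hypothetical stand-ins), and the helper name quantizer_receives_grad is my own. It only checks whether loss_decoder produces a gradient for the quantizer's parameters with and without .detach().

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

torch.manual_seed(0)

# Hypothetical stand-ins: plain linear layers, NOT the actual ReVQ modules.
quantizer = nn.Linear(4, 4)
decoder = nn.Linear(4, 4)
data = torch.randn(8, 4)


def quantizer_receives_grad(detach: bool) -> bool:
    """Return True if backpropagating loss_decoder reaches the quantizer."""
    # Reset gradients to None so a missing gradient is easy to detect.
    quantizer.zero_grad(set_to_none=True)
    decoder.zero_grad(set_to_none=True)

    quant = quantizer(data)
    # With detach=True the graph is cut between quantizer and decoder.
    data_rec = decoder(quant.detach() if detach else quant)
    loss_decoder = F.mse_loss(data, data_rec)
    loss_decoder.backward()

    return quantizer.weight.grad is not None


with_detach = quantizer_receives_grad(detach=True)
without_detach = quantizer_receives_grad(detach=False)
print("quantizer gets gradient with detach:   ", with_detach)
print("quantizer gets gradient without detach:", without_detach)
```

In this toy setup the quantizer gets a gradient from loss_decoder only when .detach() is removed, which is the coupling I am asking about. The real quantizer may of course behave differently, for example if its forward pass contains a non-differentiable lookup.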
The snippet above is from ReVQ/scripts/train.py, lines 133 to 139 in 0d6634d.