Propose fix perceptual loss sqrt nan #8414
Conversation
Thanks for the update, the changes look fine to me.
def normalize_tensor(x: torch.Tensor, eps: float = 1e-8) -> torch.Tensor:
    norm_factor = torch.sqrt(torch.sum(x**2, dim=1, keepdim=True) + eps)
    return x / (norm_factor + eps)
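For context, a minimal sketch of why the all-zero case produces NaN gradients and why moving `eps` inside the `sqrt` avoids it. The "pre-fix" form below is an assumption (following the upstream LPIPS-style normalization, with `eps` added only outside the `sqrt`), not a verbatim copy of the previous MONAI code:

```python
import torch

def normalize_tensor_old(x: torch.Tensor, eps: float = 1e-8) -> torch.Tensor:
    # Assumed pre-fix form: sqrt is evaluated at exactly 0 for an all-zero
    # input, and the gradient of sqrt at 0 is unbounded, yielding NaN.
    norm_factor = torch.sqrt(torch.sum(x**2, dim=1, keepdim=True))
    return x / (norm_factor + eps)

def normalize_tensor_new(x: torch.Tensor, eps: float = 1e-8) -> torch.Tensor:
    # Proposed fix: eps inside the sqrt keeps its gradient bounded.
    norm_factor = torch.sqrt(torch.sum(x**2, dim=1, keepdim=True) + eps)
    return x / (norm_factor + eps)

x = torch.zeros(2, 3, 4, 4, requires_grad=True)
normalize_tensor_old(x).sum().backward()
print(torch.isnan(x.grad).any())   # tensor(True): NaN gradients

x = torch.zeros(2, 3, 4, 4, requires_grad=True)
normalize_tensor_new(x).sum().backward()
print(torch.isnan(x.grad).any())   # tensor(False): gradients stay finite
```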
Suggested change:

def normalize_tensor(x: torch.Tensor, eps: float = 1e-8) -> torch.Tensor:
    norm_factor = torch.sqrt(torch.sum(x**2, dim=1, keepdim=True) + eps)
    return x / norm_factor
Do we want to remove `eps` from the denominator? As proposed, `eps` will contribute twice to the final result.
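A rough numeric illustration of this point (the values here are illustrative only): with `eps` both inside the `sqrt` and added again to the denominator, the norm is perturbed twice, while `sqrt(eps)` alone already keeps the denominator strictly positive.

```python
import torch

eps = 1e-8
x = torch.full((1, 3, 1, 1), 1e-4)           # small but nonzero values
s = torch.sum(x**2, dim=1, keepdim=True)     # squared L2 norm over channels

denom_double = torch.sqrt(s + eps) + eps     # eps applied twice (as proposed)
denom_single = torch.sqrt(s + eps)           # eps applied once (as suggested)

print(denom_double.item(), denom_single.item())
# the double-eps denominator is biased upward by an extra eps,
# while sqrt(eps) on its own already prevents division by zero
```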
Agreed. Will remove.
This file should go into an appropriate subdirectory in the `tests` directory. We've changed the directory structure there recently, so probably `tests/losses`.
Roger.
# Create tensor
x = torch.zeros(2, 3, 10, 10, requires_grad=True)

optimizer = optim.Adam([x], lr=0.01)
I don't think the optimizer is needed for this test?
Not needed, will remove.
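A minimal sketch of what the simplified test might look like once the optimizer is dropped (the import path and test name are assumptions, not the PR's actual test):

```python
import torch
from monai.losses.perceptual import normalize_tensor  # assumed import path

def test_normalize_tensor_zero_input_has_finite_grad():
    # An all-zero input is the worst case: the channel-wise norm is exactly 0,
    # which is where the unprotected sqrt produced NaN gradients.
    x = torch.zeros(2, 3, 10, 10, requires_grad=True)
    out = normalize_tensor(x)
    out.sum().backward()
    assert not torch.isnan(x.grad).any()
```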
x = torch.zeros(2, 3, 10, 10, requires_grad=True)

optimizer = optim.Adam([x], lr=0.01)
x_scaled = x * scale
Since `x` is all 0, `x_scaled` is always going to be 0, unless you're expecting float imprecision to create values here. If so, I would add a comment to mention this.
Will add a comment.
I don't understand the point of this test with regard to the next one; instead of a zeros tensor, couldn't it be a random one which would then be multiplied by a really small number?
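For illustration, a hedged sketch of the alternative being suggested here, using a random tensor scaled by a very small factor; the import path, test name, and scale value are hypothetical:

```python
import torch
from monai.losses.perceptual import normalize_tensor  # assumed import path

def test_normalize_tensor_tiny_random_input_has_finite_grad():
    # Random values scaled down hard: the norm is close to (but not exactly) 0,
    # which is the regime where the sqrt gradient used to blow up.
    x = torch.randn(2, 3, 10, 10, requires_grad=True)
    x_scaled = x * 1e-20
    out = normalize_tensor(x_scaled)
    out.sum().backward()
    assert not torch.isnan(x.grad).any()
```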
The changes look good to me. I'd modify one of the tests, but the rest is fine.
Besides my comment about the point of one of the tests, I think this PR can be merged, as long as the errors in the automated tests are fixed.
Fixes #8412
Description
This PR fixes a numerical stability issue in the `PerceptualLoss` implementation where the `normalize_tensor` function can produce NaN gradients when the input values are very small.

Types of changes
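As a rough, unverified illustration of how this can surface at the loss level: the constructor arguments below are assumptions based on MONAI's public `PerceptualLoss` API, and this is a sketch of the failure mode rather than a confirmed reproduction.

```python
import torch
from monai.losses import PerceptualLoss  # assumed import; downloads pretrained weights

# Sketch only: near-constant inputs can drive intermediate feature norms toward zero,
# so backprop through the channel normalization evaluates sqrt near 0.
pred = torch.zeros(1, 1, 64, 64, requires_grad=True)
target = torch.zeros(1, 1, 64, 64)

loss_fn = PerceptualLoss(spatial_dims=2)  # default backbone assumed
loss = loss_fn(pred, target)
loss.backward()
print(torch.isnan(pred.grad).any())  # may be NaN before this fix; finite after
```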
./runtests.sh -f -u --net --coverage
./runtests.sh --quick --unittests --disttests