After upgrading from 8.6 to 10.8 or 10.9, TensorRT's results are inconsistent with ONNX Runtime #4400
Comments
@2730gf Your output distributions look very similar; can you upload the full screenshot?
@lix19937 Thanks for your reply. I have updated the screenshot.
From your screenshot, the maximum difference occurs in one place, and the differences mostly fall in [0, 0.0316]. On different GPU architectures, the tactics chosen for the kernel implementations are different. BTW, you can also evaluate through accuracy metrics.
@lix19937 This model is FP32, and generally speaking, FP32 inference should not produce such a large diff. Most importantly, it has already affected the accuracy of the model. That is why we started analyzing the output diff of the intermediate nodes.
BTW, you can use Polygraphy to locate which layer the diff begins at.
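The layer-bisection idea suggested above can be sketched in plain NumPy: given per-layer outputs captured from both runtimes (e.g. via Polygraphy's output-marking options), walk the layers in graph order and report the first one that exceeds the tolerance. This is a minimal illustration of the logic, not Polygraphy's actual implementation; the layer names are hypothetical.

```python
import numpy as np

def first_diverging_layer(onnx_outs, trt_outs, atol=1e-4, rtol=1e-3):
    """Given dicts mapping layer name -> array (insertion order assumed to
    follow graph order), return the first layer whose TensorRT output
    exceeds the tolerance against the ONNX Runtime reference, or None."""
    for name, ref in onnx_outs.items():
        out = trt_outs[name]
        if not np.allclose(out, ref, atol=atol, rtol=rtol):
            return name
    return None

# Toy example with hypothetical layer names.
ref_outs = {"conv_0": np.ones(4), "concat_1": np.ones(4)}
trt_outs = {"conv_0": np.ones(4), "concat_1": np.ones(4) + 1e-2}
print(first_diverging_layer(ref_outs, trt_outs))  # concat_1
```

Once the first diverging layer is known, the subgraph feeding it can be extracted and minimized, which is how the reproducer below was found.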
@lix19937
Sorry, I thought you were using trtexec. |
@lix19937 Hello, it is not trained using torch, but it is indeed a transformer-based model. Is there any way to fix this accuracy issue? |
Description
After upgrading tensorrt to 10.8, the model accuracy decreased.
After marking all nodes of the model as outputs, the accuracy matched again. We suspected that a fusion strategy introduced by the upgrade caused the accuracy problem.
Finally, we used the polygraphy tool to find an onnx subgraph that could reproduce the problem.
By the way, it should be noted that the model uses only FP32 precision, not FP16 or INT8.
Environment
TensorRT Version: 10.8 & 10.9
NVIDIA GPU: 3090
NVIDIA Driver Version: 550.67
CUDA Version: 12.2
Relevant Files
I posted the ONNX file here:
Model link: https://github.com/2730gf/issues/blob/main/trt_inconsistent/mini_graph.onnx
Steps To Reproduce
Commands or scripts:

polygraphy run mini_graph.onnx -v -v -v -v -v --pool-limit workspace:20G --onnxrt --trt --validate --atol 1e-4 --rtol 1e-3 --onnx-outputs p2o.Concat.125 --trt-outputs p2o.Concat.125
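For reference, the `--atol`/`--rtol` pass criterion above follows the usual elementwise tolerance check (the same semantics as `numpy.isclose`): an element passes if `|out - ref| <= atol + rtol * |ref|`. A minimal sketch, assuming this combined-tolerance formula:

```python
import numpy as np

def within_tolerance(out, ref, atol=1e-4, rtol=1e-3):
    # Elementwise check with numpy.isclose-style semantics:
    # pass iff |out - ref| <= atol + rtol * |ref| for every element.
    return bool(np.all(np.abs(out - ref) <= atol + rtol * np.abs(ref)))

ref = np.array([1.0, 100.0])
print(within_tolerance(ref + 5e-5, ref))  # True: absorbed by atol/rtol
print(within_tolerance(ref * 1.02, ref))  # False: 2% error exceeds rtol=1e-3
```

Note that for small-magnitude reference values the absolute term dominates, so a diff of ~0.03 (as in the screenshots above) clearly fails at `atol=1e-4`.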