[pass] Avoid lifting tensors that are too small to initializers #2190
Conversation
Copilot reviewed 2 out of 2 changed files in this pull request and generated no comments.
Comments suppressed due to low confidence (2)
onnxscript/ir/passes/common/constant_manipulation_test.py:58
- Consider adding test cases that verify the behavior when using the default size_limit (16). It would ensure that tensors with fewer than 16 elements are not lifted.
result = constant_manipulation.LiftConstantsToInitializersPass(lift_all_constants=lift_all_constants, size_limit=0)
onnxscript/ir/passes/common/constant_manipulation.py:121
- [nitpick] Consider using '%d' instead of '%s' in the following logger.debug call to emphasize that size_limit is an integer.
if tensor.size < self.size_limit:
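As a rough illustration of the second suppressed suggestion (not the actual patch), a hypothetical helper could pair the size check quoted above with a `%d`-formatted debug log; `_should_lift` and its signature are made up here, and `tensor` is assumed to expose an integer `.size` element count as in the quoted line:

```python
import logging

logger = logging.getLogger(__name__)


def _should_lift(tensor, size_limit: int) -> bool:
    """Hypothetical helper mirroring the size check quoted above."""
    if tensor.size < size_limit:
        # %d (rather than %s) makes it explicit that both values are integers.
        logger.debug(
            "Tensor has %d elements, below size_limit %d; not lifting it.",
            tensor.size,
            size_limit,
        )
        return False
    return True
```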
❌ 5 Tests Failed.
Co-authored-by: Shubham Bhokare <[email protected]>
I understand the motivation, but this will create behavior different from the models shipped by GenAI, which lifted all constants with "value". I am not sure whether lifting all tensors was simply easier for them to implement or whether they have a specific requirement/process behind it.
I think users can tweak the value of size_limit to control this behavior. As part of the general optimization passes, I think it makes more sense to keep the default behavior proposed in this PR. If there are special requirements (which we need to understand better than by just looking at the current proto-processing behavior), we can either (1) change the size limit in the ORT optimization passes or (2) update GenAI so that it is less constrained.
@kunal-vaishnavi I assume the logic is there for pattern matching. Could you share the assumptions the patterns make? What are the patterns? Or is it for a different purpose?
The ORT transformer optimizer applies its graph fusions to initializers instead of Constant nodes.
Is this the only fusion? I can see the example is on model weights/biases so this should already be accounted for. |
This is just one example. Almost all fusions in the ORT transformer optimizer expect initializers instead of Constant nodes.
Thanks. To be on the safe side, we can set size_limit to 0 when running the pass in that flow.
We can just apply this pass again, together with the ClearMetadata pass, after the export (in builder.py). It's not necessary to have this change by default if we believe size_limit=16 is more beneficial in the general case.
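As a minimal sketch of the option discussed above: only the module path and constructor arguments come from this PR; how the pass is applied (calling the instance on `model`), the value passed for lift_all_constants, and the variable `model` itself are assumptions.

```python
from onnxscript.ir.passes.common import constant_manipulation

# size_limit=0 lifts every constant, matching what the ORT transformer
# fusions expect. `model` is assumed to be an in-memory IR model obtained
# elsewhere; the calling convention below is an assumption, not from the PR.
lift_all = constant_manipulation.LiftConstantsToInitializersPass(
    lift_all_constants=True,  # assumed boolean flag, as used in the test above
    size_limit=0,
)
result = lift_all(model)
```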
Tensors with very few elements are usually not weights, and there are many of them; lifting them would make the initializer list very noisy. I added a size_limit parameter to control this and defaulted it to 16.
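A tiny worked example of that default (the tensor names and shapes are made up): anything with fewer than 16 elements keeps its Constant node, anything at or above the limit is lifted to an initializer.

```python
import math

SIZE_LIMIT = 16  # default proposed in this PR

for name, shape in [("scale", (8,)), ("weight", (4, 8))]:
    size = math.prod(shape)  # number of elements in the tensor
    print(f"{name}: shape={shape}, size={size}, lifted={size >= SIZE_LIMIT}")

# Output:
# scale: shape=(8,), size=8, lifted=False    -> stays a Constant node
# weight: shape=(4, 8), size=32, lifted=True -> becomes an initializer
```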