This repository was archived by the owner on Jan 28, 2026. It is now read-only.
When I run a model on an NPU and one of its layers has a dynamic input shape, I currently have to call a resize method to fix the input shape before the model will run correctly. I don't understand how ipex_llm gets around this limitation when running on NPU devices. Could you please explain how this is achieved?

I have considered simply fixing the size of the input layer. However, if my target maximum is 8K tokens while the initial input may be fewer than 100 tokens, padding to the maximum and masking the rest would waste far too much compute. I have not been able to find a good solution, so I am raising this question and look forward to your reply!
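To make the concern concrete, here is a minimal sketch of the waste I am worried about. It is not ipex_llm's mechanism (that is exactly what I am asking about). The names `padding_waste` and `pick_bucket`, the bucket sizes, and the reading of "8K" as 8192 tokens are all my own assumptions for illustration. It compares padding every input to the maximum length against padding to the smallest of a few pre-fixed shapes.

```python
# Illustrative only: hypothetical helpers, not part of ipex_llm or any NPU SDK.

def padding_waste(actual_len: int, padded_len: int) -> float:
    """Fraction of the padded sequence that is padding (masked positions)."""
    if actual_len > padded_len:
        raise ValueError("input is longer than the fixed shape")
    return 1.0 - actual_len / padded_len


def pick_bucket(actual_len: int, buckets: list[int]) -> int:
    """Return the smallest fixed length that can hold the input."""
    for size in sorted(buckets):
        if actual_len <= size:
            return size
    raise ValueError("input exceeds the largest fixed shape")


MAX_TOKENS = 8192                  # assumed meaning of "8K tokens"
BUCKETS = [128, 512, 2048, 8192]   # assumed set of pre-fixed input shapes
prompt_len = 100                   # a short initial input

# Single fixed shape: always pad to the maximum.
waste_fixed = padding_waste(prompt_len, MAX_TOKENS)

# Several fixed shapes: pad only to the nearest one that fits.
waste_bucketed = padding_waste(prompt_len, pick_bucket(prompt_len, BUCKETS))

print(f"pad to max:    {waste_fixed:.1%} of positions are padding")
print(f"pad to bucket: {waste_bucketed:.1%} of positions are padding")
```

With a single fixed shape, almost all positions are padding for a short prompt. Several fixed shapes reduce that, but each shape would presumably need its own resized or compiled model, which is why I am asking how ipex_llm actually handles this.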