How does ipex_llm overcome the limitation that dynamic input cannot be used on NPUs?

When I run a model on an NPU today, any layer with dynamic input has to be fixed to a static shape with the resize method before the model will run normally. I don't understand how ipex_llm gets around this limitation when running on NPU devices. Could you please explain how this is achieved?

I have considered simply fixing the size of the input layer. However, if my target maximum length is 8K tokens while my initial input may be fewer than 100 tokens, padding and masking up to the maximum would waste too much compute. I have not been able to find a good solution, so I am raising this question and look forward to your reply!
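To make the trade-off concrete, here is a minimal sketch of the padding cost described above, next to one commonly used workaround: compiling a few fixed "bucket" shapes and padding each input to the smallest bucket that fits. The helper names, bucket sizes, and pad ID are assumptions made up for this illustration. This is not ipex_llm code, and nothing here implies that ipex_llm works this way.

```python
# Hypothetical illustration only: plain Python, no NPU or ipex_llm calls.

def pad_to_fixed(token_ids, fixed_len, pad_id=0):
    """Right-pad token_ids to fixed_len; return (padded_ids, attention_mask)."""
    n = len(token_ids)
    if n > fixed_len:
        raise ValueError(f"input of {n} tokens exceeds fixed length {fixed_len}")
    padded = list(token_ids) + [pad_id] * (fixed_len - n)
    mask = [1] * n + [0] * (fixed_len - n)
    return padded, mask


def pick_bucket(n, buckets):
    """Return the smallest bucket length that can hold n tokens."""
    for b in sorted(buckets):
        if n <= b:
            return b
    raise ValueError(f"input of {n} tokens exceeds the largest bucket")


MAX_LEN = 8192                    # assumed target maximum (8K tokens)
BUCKETS = (128, 512, 2048, 8192)  # assumed set of precompiled static shapes

prompt = list(range(1, 91))       # a short 90-token input

# Option 1: a single static shape at the maximum length.
ids_full, mask_full = pad_to_fixed(prompt, MAX_LEN)
wasted_full = 1 - sum(mask_full) / MAX_LEN

# Option 2: pad only up to the smallest bucket that fits.
bucket = pick_bucket(len(prompt), BUCKETS)
ids_bucket, mask_bucket = pad_to_fixed(prompt, bucket)
wasted_bucket = 1 - sum(mask_bucket) / bucket

print(f"single shape {MAX_LEN}: {wasted_full:.1%} of positions are padding")
print(f"bucket {bucket}: {wasted_bucket:.1%} of positions are padding")
```

Bucketing trades compile time and memory (one compiled model per shape) for less wasted compute on short inputs. It does not remove padding entirely, which is why I am asking how ipex_llm handles this.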

How does ipex_llm overcome the limitation that dynamic input cannot be used on NPUs? #13333
