
[WIP] Support export of Llama with DynamicCache and transformers>=4.51 #24379


Closed · wants to merge 25 commits

Conversation

xadupre (Member) commented on Apr 10, 2025

Description

Replaces #24291.

transformers>=4.51 makes DynamicCache exportable.
The modifications were tested with a tiny LLM:

python -m onnxruntime.transformers.models.llama.convert_to_onnx -m arnir0/Tiny-LLM --output Tiny-LLM --precision fp16 --execution_provider cuda --small_gp --use_dynamo_export
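For context, here is a minimal sketch (not the PR's actual code; it assumes torch>=2.6) of what the transformers>=4.51 change enables: DynamicCache is registered as a pytree node, so the dynamo-based exporter can flatten and rebuild it while tracing a model that receives it as past_key_values.

import torch
import transformers
from transformers.cache_utils import DynamicCache

# Same tiny checkpoint as the command above.
model = transformers.AutoModelForCausalLM.from_pretrained("arnir0/Tiny-LLM")
model.eval()

input_ids = torch.randint(0, model.config.vocab_size, (1, 8), dtype=torch.int64)
attention_mask = torch.ones_like(input_ids)

# transformers>=4.51 registers DynamicCache with torch.utils._pytree,
# which is what lets the exporter trace through it.
past_key_values = DynamicCache()

onnx_program = torch.onnx.export(
    model,
    (),
    kwargs={
        "input_ids": input_ids,
        "attention_mask": attention_mask,
        "past_key_values": past_key_values,
        "use_cache": True,
    },
    dynamo=True,  # the exporter path exercised by --use_dynamo_export
)
onnx_program.save("Tiny-LLM.onnx")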

@@ -7,6 +7,7 @@

 import numpy as np
 import torch
+import transformers

Check notice: Code scanning / CodeQL

Module is imported with 'import' and 'import from' (Note)

Module 'transformers' is imported with both 'import' and 'import from'.
Module 'onnxruntime.test.python.transformers' is imported with both 'import' and 'import from'.
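For reference, the pattern this CodeQL note flags, in an illustrative fragment (not the PR's code):

# Flagged: the same module is imported with both forms in one file.
import transformers
from transformers import AutoModelForCausalLM

# A typical resolution is to keep a single plain import and qualify
# usages, e.g. transformers.AutoModelForCausalLM, or to suppress the
# note when both forms are intentional.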
    return {torch_deepcopy(v) for v in value}
if isinstance(value, dict):
    return {k: torch_deepcopy(v) for k, v in value.items()}
if isinstance(value, np.ndarray):

Check failure: Code scanning / lintrunner

RUFF/F821 (undefined name) Error
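For completeness, a self-contained sketch of a torch_deepcopy helper consistent with the branches above; the surrounding branches are assumptions rather than the PR's exact code. RUFF F821 means a name is used without being defined, so a failure here typically points at a name (such as np) referenced without its import:

import numpy as np
import torch

def torch_deepcopy(value):
    # Recursively copy nested containers, cloning tensors and arrays so
    # the copy shares no storage with the original value.
    if isinstance(value, (list, tuple)):
        return type(value)(torch_deepcopy(v) for v in value)
    if isinstance(value, set):
        return {torch_deepcopy(v) for v in value}
    if isinstance(value, dict):
        return {k: torch_deepcopy(v) for k, v in value.items()}
    if isinstance(value, np.ndarray):
        return value.copy()
    if isinstance(value, torch.Tensor):
        return value.detach().clone()
    # Immutable scalars (int, float, str, None, ...) are returned as-is;
    # a real implementation might instead raise on unknown types (assumption).
    return value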

xadupre closed this on Apr 16, 2025