HF model tracker #899

Open
pdhirajkumarprasad opened this issue Jan 9, 2025 · 6 comments

pdhirajkumarprasad commented Jan 9, 2025

- Total no. of models: 545
- PASS: 307 → 408
- Numerics: 12 → 37
- Remaining failure stages: compilation, compiled_inference, setup and import

Detailed list


amd-vivekag commented Feb 13, 2025

Passing Summary

TOTAL TESTS = 544

| Stage | # Passing | % of Total | % of Attempted |
|---|---|---|---|
| Setup | 532 | 97.8% | 97.8% |
| IREE Compilation | 457 | 84.0% | 85.9% |
| Gold Inference | 451 | 82.9% | 98.7% |
| IREE Inference Invocation | 445 | 81.8% | 98.7% |
| Inference Comparison (PASS) | 406 | 74.6% | 91.2% |

Fail Summary

TOTAL TESTS = 544

| Stage | # Failed at Stage | % of Total |
|---|---|---|
| Setup | 12 | 2.2% |
| IREE Compilation | 75 | 13.8% |
| Gold Inference | 6 | 1.1% |
| IREE Inference Invocation | 6 | 1.1% |
| Inference Comparison | 39 | 7.2% |
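
The two summaries are consistent with a simple cascade: each stage only attempts the tests that passed the previous stage, "% of Attempted" divides by that stage's attempted count, and each stage's fail count is the drop between consecutive stages. A minimal sketch that reproduces the numbers above (plain Python, not the actual report generator):

```python
# Sketch: derive the summary-table percentages from the stage pass counts.
# Not the actual SHARK-TestSuite report code; just the apparent arithmetic.
TOTAL = 544
passing = [
    ("Setup", 532),
    ("IREE Compilation", 457),
    ("Gold Inference", 451),
    ("IREE Inference Invocation", 445),
    ("Inference Comparison (PASS)", 406),
]

attempted = TOTAL
for stage, n in passing:
    pct_total = 100 * n / TOTAL          # "% of Total" column
    pct_attempted = 100 * n / attempted  # "% of Attempted" column
    failed = attempted - n               # "# Failed at Stage" column
    print(f"{stage}: {n} passing ({pct_total:.1f}% / {pct_attempted:.1f}%), {failed} failed here")
    attempted = n  # the next stage only attempts what passed this one
```

For example, IREE Compilation comes out as 84.0% of total, 85.9% of attempted (457 of the 532 that passed Setup), with 75 failing at that stage, matching both tables.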

GIST containing all the failures: https://gist.github.com/amd-vivekag/377a7b141b40c118f880b2ced176f95c

The following issues are failing on CPU:

| # | Issue type | Issue message | Issue no. | # Models impacted | Models | Assignee | Status |
|---|---|---|---|---|---|---|---|
| 1 | setup | ImportError: Loading an AWQ quantized model requires the auto-awq library (`pip install autoawq`) | 918 | 2 | hf_Midnight-Miqu-70B-v1.5-4bit, hf_Meta-Llama-3.1-8B-Instruct-AWQ-INT4 | | |
| 3 | setup | IndexError: index out of range in self | 920 | 1 | hf_ruRoPEBert-e5-base-2k | | |
| 5 | setup | importlib.metadata.PackageNotFoundError: No package metadata was found for bitsandbytes | 922 | 1 | hf_Meta-Llama-3.1-8B-Instruct-bnb-4bit | | |
| 6 | setup | RuntimeError: Error(s) in loading state_dict for DebertaV2ForMultipleChoice | 923 | 1 | hf_fine-tuned-MoritzLaurer-deberta-v3-large-zeroshot-v2.0-arceasy | | |
| 7 | setup | TypeError: DisableCompileContextManager.enter....() got an unexpected keyword argument 'dtype' | 924 | 1 | hf_Llama3-8B-1.58-100B-tokens-GGUF | | |
| 8 | setup | torch.onnx.errors.UnsupportedOperatorError: Exporting the operator 'aten::bitwise_and' to ONNX opset version 14 is not supported | 925 | 1 | hf_Mistral-7B-Instruct-v0.2-GPTQ | | |
| 12 | import_model | Assertion `node->outputs().size() < 4` failed | #929 | 1 | hf_nfnet_l0.ra2_in1k | | |
| 13 | compilation | error: failed to legalize operation 'torch.operator' that was explicitly marked illegal (onnx.If return type issue) | #930 | 45 | hf_1_microsoft_deberta_V1.0, hf_1_microsoft_deberta_V1.1, hf_checkpoints_10_1_microsoft_deberta_V1.1_384, hf_checkpoints_1_16, hf_checkpoints_26_9_microsoft_deberta_21_9, hf_checkpoints_28_9_microsoft_deberta_V2, hf_checkpoints_28_9_microsoft_deberta_V4, hf_checkpoints_28_9_microsoft_deberta_V5, hf_checkpoints_29_9_microsoft_deberta_V1, hf_checkpoints_30_9_microsoft_deberta_V1.0_384, hf_checkpoints_3_14, hf_content, hf_deberta-base, hf_deberta_finetuned_pii, hf_deberta-large-mnli, hf_Debertalarg_model_multichoice_Version2, hf_deberta-v2-base-japanese, hf_deberta-v2-base-japanese-char-wwm, hf_deberta-v3-base, hf_deberta-v3-base-absa-v1.1, hf_deberta-v3-base_finetuned_ai4privacy_v2, hf_deberta-v3-base-injection, hf_DeBERTa-v3-base-mnli-fever-anli, hf_deberta-v3-base-squad2, hf_deberta-v3-base-zeroshot-v1.1-all-33, hf_deberta-v3-large, hf_deberta-v3-large_boolq, hf_deberta-v3-large-squad2, hf_deberta-v3-large_test, hf_deberta-v3-large_test_9e-6, hf_deberta-v3-small, hf_deberta-v3-xsmall, hf_llm-mdeberta-v3-swag, hf_mdeberta-v3-base, hf_mDeBERTa-v3-base-mnli-xnli, hf_mdeberta-v3-base-squad2, hf_mDeBERTa-v3-xnli-ft-bs-multiple-choice, hf_Medical-NER, hf_mxbai-rerank-base-v1, hf_mxbai-rerank-xsmall-v1, hf_nli-deberta-v3-base, hf_output, hf_piiranha-v1-detect-personal-information, hf_splinter-base, hf_splinter-base-qass | | |
| 14 | compilation | error: failed to legalize unresolved materialization from ('i64') to ('index') that remained live after conversion | iree-org/iree#18899 | 3 | hf_deeplabv3-mobilevit-small, hf_deeplabv3-mobilevit-xx-small, hf_mobilevit-small | | |
| 15 | compilation | error: 'flow.dispatch.workgroups' op value set has 3 dynamic dimensions but only 2 dimension values are attached | iree-org/iree#20154 | 3 | hf_beit-base-patch16-224-pt22k, hf_beit-base-patch16-224-pt22k-ft22k, hf_pedestrian_gender_recognition | | |
| 16 | compilation | error: expected sizes to be non-negative, but got -1 | iree-org/iree#19501 | 7 | hf_swin_base_patch4_window7_224.ms_in22k_ft_in1k, hf_swin-tiny-patch4-window7-224, hf_yolos-base, hf_yolos-fashionpedia, hf_yolos-small, hf_yolos-small-finetuned-license-plate-detection, hf_yolos-small-rego-plates-detection | | |
| 17 | compilation | error: 'stream.async.dispatch' op has invalid Read access range | iree-org/iree#20155 | 1 | hf_dpt-large-ade | | |
| 18 | compilation | error: 'iree_linalg_ext.pack' op write affecting operations on global resources are restricted to workgroup distributed contexts | iree-org/iree#20156 | 1 | hf_distilhubert | | |
| 19 | compilation | error: expected offsets to be non-negative, but got -1 | iree-org/iree#19935 | 1 | hf_pnasnet5large.tf_in1k | | |
| 23 | native_inference | [ONNXRuntimeError] : 2 : INVALID_ARGUMENT : Got invalid dimensions for input: pixel_values for the following indices | #941 | 1 | hf_mobilenet_v1_0.75_192 | | |
| 24 | native_inference | [ONNXRuntimeError] : 6 : RUNTIME_EXCEPTION : Non-zero status code returned while running Add node | #942 | 1 | hf_eva_large_patch14_196.in22k_ft_in22k_in1k | | |
| 26 | compiled_inference | :0: FAILED_PRECONDITION; onnx.Expand input has a dim that is not statically 1 | #944 | 2 | hf_phobert-base-finetuned, hf_phobert-large-finetuned | | |

The following issues have been resolved:

| # | Issue type | Issue message | Issue no. | # Models impacted | Models | Assignee | Status |
|---|---|---|---|---|---|---|---|
| 2 | setup | requests.exceptions.HTTPError: 401 Client Error: Unauthorized for url | 919 | 3 | hf_Multiple_Choice, hf_multiple_choice_model, hf_Multiple_Choice_EN | @amd-vivekag | Fixed in PR: nod-ai/SHARK-TestSuite#456 |
| 4 | setup | Unknown task: fill-mask | 921 | 2 | hf_multi-qa-mpnet-base-cos-v1, hf_all-mpnet-base-v1 | @amd-vivekag | Fixed in PR: nod-ai/SHARK-TestSuite#456 |
| 9 | import_model | Killed due to OOM | #926 | 1 | hf_StableBeluga2 | @amd-vivekag | Fixed in PR: nod-ai/SHARK-TestSuite#451 |
| 10 | import_model | assertNonNull: Assertion `g.get() != nullptr` failed | #927 | 5 | hf_esm2_t36_3B_UR50D, hf_Phi-3.5-mini-instruct, hf_Phi-3-mini-128k-instruct, hf_Phi-3-mini-4k-instruct, hf_zephyr-7b-beta | @amd-vivekag | Fixed in PR: nod-ai/SHARK-TestSuite#451 |
| 11 | import_model | assertInVersionRange: Assertion `version >= version_range.first && version <= version_range.second` failed | #928 | 8 | hf_llama-7b, hf_oasst-sft-4-pythia-12b-epoch-3.5, hf_Qwen2.5-1.5B-Instruct, hf_Qwen2.5-7B-Instruct, hf_Qwen2-7B-Instruct, hf_TinyLlama-1.1B-Chat-v1.0, hf_vicuna-7b-v1.5, hf_wasmai-7b-v1 | @amd-vivekag | Fixed in PR: nod-ai/SHARK-TestSuite#451 |
| 20 | construct_inputs | ValueError: Asking to pad but the tokenizer does not have a padding token | #938 | 4 | hf_distilgpt2, hf_gpt2, hf_llama-68m, hf_tiny-random-mistral | @amd-vivekag | Fixed in PR: nod-ai/SHARK-TestSuite#451 |
| 21 | construct_inputs | name 'tokens' is not defined | #939 | 1 | hf_wavlm-base-plus | @amd-vivekag | Fixed in PR: nod-ai/SHARK-TestSuite#442 |
| 22 | native_inference | IndexError: tuple index out of range | #940 | 14 | hf_bart-base, hf_gpt2-small-spanish, hf_ivila-row-layoutlm-finetuned-s2vl-v2, hf_opt-125m, hf_Qwen1.5-0.5B-Chat, hf_Qwen2-0.5B, hf_Qwen2.5-0.5B-Instruct, hf_really-tiny-falcon-testing, hf_tiny-dummy-qwen2, hf_tiny-Qwen2ForCausalLM-2.5, hf_tiny-random-GemmaForCausalLM, hf_tiny-random-LlamaForCausalLM, hf_tiny-random-mt5, hf_tiny-random-Phi3ForCausalLM | @amd-vivekag | Fixed in PR: nod-ai/SHARK-TestSuite#447 |
| 25 | compiled_inference | INVALID_ARGUMENT; function expected fewer input values; parsing input input.bin | #943 | 4 | hf_ko-sroberta-multitask, hf_robertuito-sentiment-analysis, hf_sbert_large_nlu_ru, hf_sentence-bert-base-ja-mean-tokens-v2 | @amd-vivekag | Fixed in PR: nod-ai/SHARK-TestSuite#453 |
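
Issue 20 ("Asking to pad but the tokenizer does not have a padding token") is the classic GPT-2-family symptom: those tokenizers ship without a pad token. The common workaround is to reuse an existing special token as the pad token (with transformers, typically `tokenizer.pad_token = tokenizer.eos_token`). A minimal stdlib-only sketch of the pattern, using a hypothetical stand-in class rather than a real tokenizer; whether the referenced PR applies exactly this fix is an assumption:

```python
# Stand-in illustrating the usual fix for "Asking to pad but the tokenizer
# does not have a padding token". With transformers the workaround is
# typically `tokenizer.pad_token = tokenizer.eos_token`.
class TinyTokenizer:
    """Hypothetical GPT-2-like tokenizer: has an EOS token but no pad token."""

    def __init__(self):
        self.eos_token = "</s>"
        self.pad_token = None

    def pad(self, batch, length):
        # Mirrors the transformers error raised when padding without a pad token.
        if self.pad_token is None:
            raise ValueError(
                "Asking to pad but the tokenizer does not have a padding token"
            )
        return [seq + [self.pad_token] * (length - len(seq)) for seq in batch]


tok = TinyTokenizer()
tok.pad_token = tok.eos_token  # the common one-line workaround
print(tok.pad([["a", "b"], ["c"]], length=2))  # → [['a', 'b'], ['c', '</s>']]
```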


zjgarvey commented Feb 13, 2025

I assume the most recent run is on CPU? Can you share the detail table in a gist? Can you also post the IREE version?

@amd-vivekag

> I assume the most recent run is on CPU? Can you share the detail table in a gist? Can you also post the IREE version?

Yes, these were run on CPU. The GPU run had around 40 more failures. I'm using the following IREE version:

```
IREE (https://iree.dev):
  IREE compiler version 3.2.0rc20250206 @ f3bef2de123f08b4fc3b0ce691494891bd6760d0
  LLVM version 20.0.0git
  Optimized build
```

Here is the link to the detailed table:
https://gist.github.com/amd-vivekag/377a7b141b40c118f880b2ced176f95c

@pdhirajkumarprasad

Here is the latest status on HF models: https://gist.github.com/pdhirajkumarprasad/784eee989d6935d1074c217de2040477. We should focus on the 6/7 issues mentioned there.

@amd-vivekag, please list the issue numbers for the issues mentioned on the above page.

@zjgarvey, we need to focus on these; let's try to get to a clean state by next week so that we are in good shape w.r.t. HF models.


vmnit commented Mar 25, 2025

Latest failure summary: the current number of HF failures by stage:

```
 66 compilation
  2 compiled_inference
  1 import_model
  4 native_inference
 36 Numerics
  7 setup
```

Here,

- 2 extra native_inference failures come from an onnxruntime segmentation fault in the latest release, 1.21.0 (they pass with 1.20.1). Issue created: microsoft/onnxruntime#24144
- 5 extra compilation failures (2 failing since 3.3.0rc20250319 and 3 failing since 3.3.0rc20250312)

Failing since iree-base-compiler v3.3.0rc20250319 (git commit: iree-org/iree@fba3a7c):

- hf_detr-layout-detection
- hf_detr-resnet-50-panoptic

Issue: iree-org/iree#20379

Failing since iree-base-compiler v3.3.0rc20250312 (PR: iree-org/iree#20159):

- hf_table-transformer-detection
- hf_table-transformer-detection-custom-ale
- hf_vit_base_patch32_224.augreg_in21k_ft_in1k

Issue: iree-org/iree#20277
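
The per-stage counts at the top of this comment are in the shape `sort | uniq -c` emits, so a tally like that can be regenerated from a two-column "test / failing-stage" listing with a standard pipeline. A small self-contained sketch; the file name and report layout are assumptions, not the actual SHARK-TestSuite report format:

```shell
# Build a tiny example two-column report (hypothetical file name/layout),
# then tally failures per stage the same way `uniq -c` formats counts.
printf 'hf_a compilation\nhf_b Numerics\nhf_c compilation\n' > report.txt
awk '{print $2}' report.txt | sort | uniq -c
```

The second column is extracted, sorted so identical stages are adjacent, and counted, yielding one "count stage" line per failure stage.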

@amd-vivekag

Passing Summary

TOTAL TESTS = 541

| Stage | # Passing | % of Total | % of Attempted |
|---|---|---|---|
| Setup | 534 | 98.7% | 98.7% |
| IREE Compilation | 467 | 86.3% | 87.5% |
| Gold Inference | 465 | 86.0% | 99.6% |
| IREE Inference Invocation | 463 | 85.6% | 99.6% |
| Inference Comparison (PASS) | 427 | 78.9% | 92.2% |

Fail Summary

TOTAL TESTS = 541

| Stage | # Failed at Stage | % of Total |
|---|---|---|
| Setup | 7 | 1.3% |
| IREE Compilation | 67 | 12.4% |
| Gold Inference | 2 | 0.4% |
| IREE Inference Invocation | 2 | 0.4% |
| Inference Comparison | 36 | 6.7% |

Test Run Detail

Test was run with the following arguments:

```
Namespace(device='local-task', backend='llvm-cpu', target_chip='x86_64-linux-gnu', iree_compile_args=None, mode='cl-onnx-iree', torchtolinalg=False, stages=None, skip_stages=None, benchmark=False, load_inputs=False, groups='all', test_filter=None, testsfile='hf_tests.txt', tolerance=None, verbose=True, rundirectory='test-run', no_artifacts=False, cleanup='0', report=True, report_file='reports/hf_all_tests_543.md', get_metadata=True)
```

| Test | Exit Status | Mean Benchmark Time (ms) | Notes |
|---|---|---|---|
| hf_1_microsoft_deberta_V1.0 | compilation | None | |
| hf_1_microsoft_deberta_V1.1 | compilation | None | |
| hf_bart-large-mnli | Numerics | None | |
| hf_beit-base-patch16-224-pt22k | compilation | None | |
| hf_beit-base-patch16-224-pt22k-ft22k | compilation | None | |
| hf_checkpoints_10_1_microsoft_deberta_V1.1_384 | compilation | None | |
| hf_checkpoints_1_16 | compilation | None | |
| hf_checkpoints_26_9_microsoft_deberta_21_9 | compilation | None | |
| hf_checkpoints_28_9_microsoft_deberta_V2 | compilation | None | |
| hf_checkpoints_28_9_microsoft_deberta_V4 | compilation | None | |
| hf_checkpoints_28_9_microsoft_deberta_V5 | compilation | None | |
| hf_checkpoints_29_9_microsoft_deberta_V1 | compilation | None | |
| hf_checkpoints_30_9_microsoft_deberta_V1.0_384 | compilation | None | |
| hf_checkpoints_3_14 | compilation | None | |
| hf_content | compilation | None | |
| hf_deberta-base | compilation | None | |
| hf_deberta-large-mnli | compilation | None | |
| hf_deberta-v2-base-japanese | compilation | None | |
| hf_deberta-v2-base-japanese-char-wwm | compilation | None | |
| hf_deberta-v3-base | compilation | None | |
| hf_deberta-v3-base-absa-v1.1 | compilation | None | |
| hf_deberta-v3-base-injection | compilation | None | |
| hf_DeBERTa-v3-base-mnli-fever-anli | compilation | None | |
| hf_deberta-v3-base-squad2 | compilation | None | |
| hf_deberta-v3-base-zeroshot-v1.1-all-33 | compilation | None | |
| hf_deberta-v3-base_finetuned_ai4privacy_v2 | compilation | None | |
| hf_deberta-v3-large | compilation | None | |
| hf_deberta-v3-large-squad2 | compilation | None | |
| hf_deberta-v3-large_boolq | compilation | None | |
| hf_deberta-v3-large_test | compilation | None | |
| hf_deberta-v3-large_test_9e-6 | compilation | None | |
| hf_deberta-v3-small | compilation | None | |
| hf_deberta-v3-xsmall | compilation | None | |
| hf_deberta_finetuned_pii | compilation | None | |
| hf_Debertalarg_model_multichoice_Version2 | compilation | None | |
| hf_deeplabv3-mobilevit-small | compilation | None | |
| hf_deeplabv3-mobilevit-xx-small | compilation | None | |
| hf_densenet121.ra_in1k | Numerics | None | |
| hf_detr-doc-table-detection | Numerics | None | |
| hf_detr-layout-detection | compilation | None | |
| hf_detr-resnet-101 | Numerics | None | |
| hf_detr-resnet-101-dc5 | Numerics | None | |
| hf_detr-resnet-50 | Numerics | None | |
| hf_detr-resnet-50-dc5 | Numerics | None | |
| hf_detr-resnet-50-finetuned-10k-cppe5 | Numerics | None | |
| hf_detr-resnet-50-panoptic | compilation | None | |
| hf_detr-resnet-50-sku110k | Numerics | None | |
| hf_diagram_detr_r50_finetuned | Numerics | None | |
| hf_distilhubert | compilation | None | |
| hf_ditr-e15 | Numerics | None | |
| hf_dpt-large-ade | compilation | None | |
| hf_ese_vovnet19b_dw.ra_in1k | Numerics | None | |
| hf_eva_large_patch14_196.in22k_ft_in22k_in1k | native_inference | None | |
| hf_fine-tuned-MoritzLaurer-deberta-v3-large-zeroshot-v2.0-arceasy | setup | None | |
| hf_inception_resnet_v2.tf_in1k | Numerics | None | |
| hf_inception_v3.tf_adv_in1k | Numerics | None | |
| hf_inception_v3.tv_in1k | Numerics | None | |
| hf_Llama3-8B-1.58-100B-tokens-GGUF | setup | None | |
| hf_llm-mdeberta-v3-swag | compilation | None | |
| hf_mdeberta-v3-base | compilation | None | |
| hf_mDeBERTa-v3-base-mnli-xnli | compilation | None | |
| hf_mdeberta-v3-base-squad2 | compilation | None | |
| hf_mDeBERTa-v3-xnli-ft-bs-multiple-choice | compilation | None | |
| hf_Medical-NER | compilation | None | |
| hf_Meta-Llama-3.1-8B-Instruct-AWQ-INT4 | setup | None | |
| hf_Meta-Llama-3.1-8B-Instruct-bnb-4bit | setup | None | |
| hf_Midnight-Miqu-70B-v1.5-4bit | setup | None | |
| hf_Mistral-7B-Instruct-v0.2-GPTQ | setup | None | |
| hf_mobilenet_v1_0.75_192 | native_inference | None | |
| hf_mobilevit-small | compilation | None | |
| hf_mxbai-rerank-base-v1 | compilation | None | |
| hf_mxbai-rerank-xsmall-v1 | compilation | None | |
| hf_nfnet_l0.ra2_in1k | import_model | None | |
| hf_nli-deberta-v3-base | compilation | None | |
| hf_output | compilation | None | |
| hf_pedestrian_gender_recognition | compilation | None | |
| hf_phobert-base-finetuned | compiled_inference | None | |
| hf_phobert-large-finetuned | compiled_inference | None | |
| hf_piiranha-v1-detect-personal-information | compilation | None | |
| hf_pix2text-table-rec | Numerics | None | |
| hf_pnasnet5large.tf_in1k | compilation | None | |
| hf_Qwen2.5-1.5B-Instruct | Numerics | None | |
| hf_resnet-18 | Numerics | None | |
| hf_resnet-50 | Numerics | None | |
| hf_resnet101.a1h_in1k | Numerics | None | |
| hf_resnet18.a1_in1k | Numerics | None | |
| hf_resnet34.a1_in1k | Numerics | None | |
| hf_resnet50.a1_in1k | Numerics | None | |
| hf_resnext50_32x4d.fb_swsl_ig1b_ft_in1k | Numerics | None | |
| hf_ruRoPEBert-e5-base-2k | setup | None | |
| hf_splinter-base | compilation | None | |
| hf_splinter-base-qass | compilation | None | |
| hf_swin-tiny-patch4-window7-224 | compilation | None | |
| hf_swin_base_patch4_window7_224.ms_in22k_ft_in1k | compilation | None | |
| hf_table-transformer-detection | compilation | None | |
| hf_table-transformer-detection-custom-ale | compilation | None | |
| hf_table-transformer-structure-recognition | Numerics | None | |
| hf_table-transformer-structure-recognition-v1.1-all | Numerics | None | |
| hf_table-transformer-structure-recognition-v1.1-pub | Numerics | None | |
| hf_tf_efficientnet_b0.ns_jft_in1k | Numerics | None | |
| hf_tf_efficientnetv2_s.in21k | Numerics | None | |
| hf_tf_mobilenetv3_large_minimal_100.in1k | Numerics | None | |
| hf_tf_mobilenetv3_small_minimal_100.in1k | Numerics | None | |
| hf_vgg16.tv_in1k | Numerics | None | |
| hf_vgg19.tv_in1k | Numerics | None | |
| hf_vit_base_patch32_224.augreg_in21k_ft_in1k | compilation | None | |
| hf_wavlm-base-plus | Numerics | None | |
| hf_wide_resnet50_2.racm_in1k | Numerics | None | |
| hf_xcit_tiny_24_p8_384.fb_dist_in1k | Numerics | None | |
| hf_yolos-base | compilation | None | |
| hf_yolos-fashionpedia | compilation | None | |
| hf_yolos-small | compilation | None | |
| hf_yolos-small-finetuned-license-plate-detection | compilation | None | |
| hf_yolos-small-rego-plates-detection | compilation | None | |
