
segmentation fault while using onnxruntime==1.21.0 #24144

Open
vmnit opened this issue Mar 24, 2025 · 11 comments
Labels
core runtime issues related to core runtime

Comments

@vmnit

vmnit commented Mar 24, 2025

onnxruntime crashes with a segmentation fault when using version 1.21.0. It does not crash with the 1.20.1 release.

Steps to reproduce:

import onnxruntime as ort
sess_options = ort.SessionOptions()
sess = ort.InferenceSession('hf_Qwen2-7B-Instruct_model.onnx', sess_options)

hf_Qwen2-7B-Instruct_model.onnx.gz

@yuslepukhin yuslepukhin added the core runtime issues related to core runtime label Mar 24, 2025
@yuslepukhin
Member

The model is referring to an external weights file. Would you like to supply it?

[image attachment]

@yuslepukhin
Member

Here is the exception message that is issued. I am not seeing a segmentation fault on the main build:

unknown file: error: C++ exception with description "Load model from D:/dev/data/SegmentationFault_gh_24144/hf_Qwen2-7B-Instruct_model.onnx failed:Load model D:/dev/data/SegmentationFault_gh_24144/hf_Qwen2-7B-Instruct_model.onnx failed" thrown in the test body.

@vmnit
Author

vmnit commented Mar 25, 2025

Hi @yuslepukhin,

Thanks for looking into it.
The data file is huge, around 29 GB. Can you please suggest a way to share it?

@vmnit
Author

vmnit commented Mar 25, 2025

Hi @yuslepukhin ,

I'm using the model from the following location: https://huggingface.co/Qwen/Qwen2-7B-Instruct/tree/main
Can you please try generating the ONNX model from there? I'm unable to find a way to upload the large data file.

@yuslepukhin
Member

Please share exactly what you did.

Also, please share any console messages; enable logging and share the output, and explain specifically what makes you think there is a segmentation fault. Please also fill out the issue template, including the version of your Linux OS, etc.
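One stdlib-only way to confirm a native crash (as opposed to a Python exception) is to enable faulthandler before creating the session; a segfault then dumps a Python traceback pointing at the offending call. The ONNX Runtime lines are commented out below since they need the exported model file, and the model path is only an assumption:

```python
import faulthandler

# Dump a Python traceback if the process receives a fatal signal
# such as SIGSEGV, instead of dying silently.
faulthandler.enable()
print("faulthandler enabled:", faulthandler.is_enabled())

# Hypothetical usage (requires onnxruntime and the exported model.onnx):
# import onnxruntime as ort
# so = ort.SessionOptions()
# so.log_severity_level = 0   # 0 = VERBOSE: surfaces load-time details
# sess = ort.InferenceSession("model.onnx", so)
```

If the traceback ends inside `InferenceSession`, that is strong evidence the crash is in native model-loading code rather than in Python.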

@amd-vivekag

@yuslepukhin I'm working on a script that can reproduce the issue at your end. I'll share it with you soon.

@vmnit
Author

vmnit commented Mar 27, 2025

Steps to reproduce:

  1. Create virtual environment: python -m venv myenv.env
  2. Activate it: source myenv.env/bin/activate
  3. Pip upgrade: pip install --upgrade pip
  4. Install some libraries: pip install onnx optimum[exporters] onnxruntime
  5. Set CACHE_DIR: export CACHE_DIR=<SOME_PATH>
  6. run script: python test_seg_fault.py
# test_seg_fault.py
import os
from optimum.exporters.onnx import main_export

cache_dir = os.environ["CACHE_DIR"]
os.environ["HF_HOME"] = cache_dir
os.environ["HUGGINGFACE_HUB_CACHE"] = cache_dir

main_export(
        "Qwen/Qwen2-7B-Instruct",
        os.getcwd(),
        task='text-generation',
        cache_dir=cache_dir,
        local_files_only=False,
        monolith=True,
        framework="pt",
        optimize=None,
        )

import onnxruntime as ort
sess_options = ort.SessionOptions()

print("before ort.InferenceSession")
sess = ort.InferenceSession('model.onnx', sess_options)

print(sess)

Please let me know if you need any information from my side in this regard.

Thanks

@yuslepukhin
Member

yuslepukhin commented Mar 27, 2025

I have followed the procedure and got the model, then produced a debug build from the tip of main.
I tried both a C++ test and your Python script, simply loading the model.
I did not get a repro. At one point physical memory usage clocked in at 31 GB and total commit was 44 GB, so it had its share of page faults, but the process completed normally. The next release is about a month away.

[image attachment]

@vmnit
Author

vmnit commented Mar 28, 2025

Hi @yuslepukhin,

Were you able to run the complete script without any segmentation fault? If yes, can you please check the onnxruntime version?
I'm able to reproduce it with the following library versions:

Successfully installed MarkupSafe-3.0.2 certifi-2025.1.31 charset-normalizer-3.4.1 coloredlogs-15.0.1 filelock-3.18.0 flatbuffers-25.2.10 fsspec-2025.3.0 huggingface-hub-0.29.3 humanfriendly-10.0 idna-3.10 jinja2-3.1.6 mpmath-1.3.0 networkx-3.4.2 numpy-2.2.4 nvidia-cublas-cu12-12.4.5.8 nvidia-cuda-cupti-cu12-12.4.127 nvidia-cuda-nvrtc-cu12-12.4.127 nvidia-cuda-runtime-cu12-12.4.127 nvidia-cudnn-cu12-9.1.0.70 nvidia-cufft-cu12-11.2.1.3 nvidia-curand-cu12-10.3.5.147 nvidia-cusolver-cu12-11.6.1.9 nvidia-cusparse-cu12-12.3.1.170 nvidia-cusparselt-cu12-0.6.2 nvidia-nccl-cu12-2.21.5 nvidia-nvjitlink-cu12-12.4.127 nvidia-nvtx-cu12-12.4.127 onnx-1.17.0 onnxruntime-1.21.0 optimum-1.24.0 packaging-24.2 pillow-11.1.0 protobuf-6.30.2 pyyaml-6.0.2 regex-2024.11.6 requests-2.32.3 safetensors-0.5.3 sympy-1.13.1 timm-1.0.15 tokenizers-0.21.1 torch-2.6.0 torchvision-0.21.0 tqdm-4.67.1 transformers-4.48.3 triton-3.2.0 typing-extensions-4.13.0 urllib3-2.3.0

The segmentation fault occurs at the inference-session step: sess = ort.InferenceSession('model.onnx', sess_options)
The print statement after that line never executes. But if you are getting a valid sess object, then it seems to be working for you.

@yuslepukhin
Member

yuslepukhin commented Mar 28, 2025

The bug reproduces with 1.21.0, but is not there with the latest code.

D:\dev\data\SegmentationFault_gh_24144$ pip list
Package Version


certifi 2025.1.31
charset-normalizer 3.4.1
colorama 0.4.6
coloredlogs 15.0.1
filelock 3.18.0
flatbuffers 25.2.10
fsspec 2025.3.0
huggingface-hub 0.29.3
humanfriendly 10.0
idna 3.10
Jinja2 3.1.6
MarkupSafe 3.0.2
mpmath 1.3.0
networkx 3.4.2
numpy 1.24.3
onnx 1.17.0
onnxruntime 1.22.0
optimum 1.24.0
packaging 24.2
pip 25.0.1
protobuf 6.30.2
pyreadline3 3.5.4
PyYAML 6.0.2
regex 2024.11.6
requests 2.32.3
safetensors 0.5.3
setuptools 65.5.0
sympy 1.13.1
tokenizers 0.21.1
torch 2.6.0
tqdm 4.67.1
transformers 4.50.2
typing_extensions 4.13.0
urllib3 2.3.0

@vmnit
Author

vmnit commented Mar 29, 2025

(quoting yuslepukhin's comment above) The bug reproduces with 1.21.0, but is not there with the latest code.

@yuslepukhin It's great that you are able to reproduce the issue. I think we should add this as a test case to avoid such a regression in the future. What do you think? Let me know if you want me to add it; if so, can you please share some documentation or guide me on how to add and verify it?
