
check_call(...stdout=subprocess.DEVNULL)) hides the error output #3760

Open
Stonepia opened this issue Mar 26, 2025 · 0 comments

Stonepia (Contributor) commented Mar 26, 2025

Describe the bug

Hi,
I found that if I forget to set ZE_PATH, Triton fails silently while compiling the kernel, without any error message. The root cause is that this line redirects stdout to the null device:

subprocess.check_call(cc_cmd, stdout=subprocess.DEVNULL)
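The effect is easy to reproduce in isolation. In this minimal sketch, a hypothetical child process stands in for cl.EXE (which, like many MSVC tools, writes its diagnostics to stdout); with the redirect in place, the diagnostic text is discarded and only the bare exit status survives in the exception:

```python
import subprocess
import sys

# Hypothetical stand-in for the compiler: prints a diagnostic to stdout,
# then exits with a non-zero status, just as cl.EXE does on a missing header.
cmd = [sys.executable, "-c", "print('fatal error: missing header'); raise SystemExit(2)"]

try:
    # stdout=subprocess.DEVNULL throws the diagnostic away.
    subprocess.check_call(cmd, stdout=subprocess.DEVNULL)
except subprocess.CalledProcessError as e:
    # Only "Command [...] returned non-zero exit status 2." remains;
    # the compiler's message is gone.
    print(e)
```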

With the redirect in place, the full (but unhelpful) error output looks like this:

F:\pytorch> python benchmarks/dynamo/timm_models.py --accuracy --float16 -d xpu -n10 --training --only fbnetv3_b --backend=inductor 
loading model: 0it [00:16, ?it/s]
xpu  train fbnetv3_b
ERROR:common:
Traceback (most recent call last):
  File "F:\pytorch\benchmarks\dynamo\common.py", line 2242, in check_accuracy
    new_result = self.run_n_iterations(
...

  File "F:\miniforge\envs\nightly\lib\site-packages\torch\utils\_triton.py", line 111, in triton_hash_with_backend
    backend = triton_backend()
  File "F:\miniforge\envs\nightly\lib\site-packages\torch\utils\_triton.py", line 103, in triton_backend
    target = driver.active.get_current_target()
  File "F:\miniforge\envs\nightly\lib\site-packages\triton\backends\intel\driver.py", line 741, in get_current_target
    device = self.get_current_device()
  File "F:\miniforge\envs\nightly\lib\site-packages\triton\backends\intel\driver.py", line 733, in get_current_device
    return self.utils.get_current_device()
  File "F:\miniforge\envs\nightly\lib\site-packages\triton\backends\intel\driver.py", line 727, in __getattr__
    self.utils = XPUUtils()
  File "F:\miniforge\envs\nightly\lib\site-packages\triton\backends\intel\driver.py", line 304, in __init__
    self.mod = compile_module_from_src(Path(os.path.join(dirname, "driver.c")).read_text(), "spirv_utils")
  File "F:\miniforge\envs\nightly\lib\site-packages\triton\backends\intel\driver.py", line 269, in compile_module_from_src
    so = _build(name, src_path, tmpdir, COMPILATION_HELPER.library_dir, COMPILATION_HELPER.include_dir,
  File "F:\miniforge\envs\nightly\lib\site-packages\triton\runtime\build.py", line 98, in _build
    subprocess.check_call(cc_cmd, stdout=subprocess.DEVNULL)
  File "F:\miniforge\envs\nightly\lib\subprocess.py", line 369, in check_call
    raise CalledProcessError(retcode, cmd)
torch._inductor.exc.InductorError: CalledProcessError: Command '['C:\\Program Files\\Microsoft Visual Studio\\2022\\Community\\VC\\Tools\\MSVC\\14.43.34808\\bin\\HostX64\\x64\\cl.EXE', '/Zc:__cplusplus', '/std:c++17', 'C:\\Users\\tongsu\\AppData\\Local\\Temp\\tmpvzfhocyq\\main.cpp', '/nologo', '/O2', '/LD', '/wd4996', '/MD', '/EHsc', '/I/usr/local\\include', '/IF:\\miniforge\\envs\\nightly\\Library\\include\\sycl', '/IF:\\miniforge\\envs\\nightly\\Library\\include', '/IF:\\miniforge\\envs\\nightly\\Lib\\site-packages\\triton\\backends\\intel\\include', '/IC:\\Users\\tongsu\\AppData\\Local\\Temp\\tmpvzfhocyq', '/IF:\\miniforge\\envs\\nightly\\Include', '/INone', '/IF:\\miniforge\\envs\\nightly\\lib\\site-packages\\numpy\\_core\\include', '/FoC:\\Users\\tongsu\\AppData\\Local\\Temp\\tmpvzfhocyq\\main.obj', '/link', '/OUT:C:\\Users\\tongsu\\AppData\\Local\\Temp\\tmpvzfhocyq\\spirv_utils.cp310-win_amd64.pyd', '/IMPLIB:C:\\Users\\tongsu\\AppData\\Local\\Temp\\tmpvzfhocyq\\main.lib', '/PDB:C:\\Users\\tongsu\\AppData\\Local\\Temp\\tmpvzfhocyq\\main.pdb', '/LIBPATH:F:\\miniforge\\envs\\nightly\\Library\\bin', '/LIBPATH:F:\\miniforge\\envs\\nightly\\Library\\lib', '/LIBPATH:/usr/local\\lib', '/LIBPATH:F:\\miniforge\\envs\\nightly\\Lib\\site-packages\\triton\\backends\\intel\\lib', '/LIBPATH:F:\\miniforge\\envs\\nightly\\libs', 'ze_loader.lib', 'sycl8.lib', '/LIBPATH:F:\\miniforge\\envs\\nightly\\Library\\bin', '/LIBPATH:F:\\miniforge\\envs\\nightly\\Library\\lib']' returned non-zero exit status 2.

Set TORCHDYNAMO_VERBOSE=1 for the internal stack trace (please do this especially if you're reporting a bug to PyTorch). For even more developer context, set TORCH_LOGS="+dynamo"

TorchDynamo optimized model failed to run because of following error
fail_to_run

If I remove the stdout=subprocess.DEVNULL redirect, I actually get the real error message:

> python benchmarks/dynamo/timm_models.py --accuracy --float16 -d xpu -n10 --training --only fbnetv3_b --backend=inductor 
loading model: 0it [00:16, ?it/s]
xpu  train fbnetv3_b

main.cpp
F:\miniforge\envs\nightly\Lib\site-packages\triton\backends\intel\include\sycl_functions.h(15): fatal error C1083: Cannot open include file: 'level_zero/ze_api.h': No such file or directory

ERROR:common:
Traceback (most recent call last):

So I would personally suggest using run(..., check=True) and printing everything when there is an error (or packing the output into the raised exception).

Environment details

@Stonepia Stonepia changed the title check_call() output is hidden check_call(...stdout=subprocess.DEVNULL)) hides the error output Mar 26, 2025