
check_call(...stdout=subprocess.DEVNULL)) hides the error output #3760

Open
Stonepia opened this issue Mar 26, 2025 · 0 comments

Stonepia (Contributor) commented Mar 26, 2025

Describe the bug

Hi,
I found that if I forget to set ZE_PATH, Triton fails silently while compiling the kernel, without any error message. The root cause is that this line redirects stdout to the null device:

subprocess.check_call(cc_cmd, stdout=subprocess.DEVNULL)
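The effect is easy to reproduce in isolation. In this minimal sketch, a hypothetical child process stands in for cl.EXE (which, like many MSVC tools, writes its diagnostics to stdout); with the redirect in place, the diagnostic text is discarded and only the bare exit status survives in the exception:

```python
import subprocess
import sys

# Hypothetical stand-in for the compiler: prints a diagnostic to stdout,
# then exits with a non-zero status, just as cl.EXE does on a missing header.
cmd = [sys.executable, "-c", "print('fatal error: missing header'); raise SystemExit(2)"]

try:
    # stdout=subprocess.DEVNULL throws the diagnostic away.
    subprocess.check_call(cmd, stdout=subprocess.DEVNULL)
except subprocess.CalledProcessError as e:
    # Only "Command [...] returned non-zero exit status 2." remains;
    # the compiler's message is gone.
    print(e)
```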

With the redirect in place, the full (but unhelpful) error output looks like this:

F:\pytorch> python benchmarks/dynamo/timm_models.py --accuracy --float16 -d xpu -n10 --training --only fbnetv3_b --backend=inductor 
loading model: 0it [00:16, ?it/s]
xpu  train fbnetv3_b
ERROR:common:
Traceback (most recent call last):
  File "F:\pytorch\benchmarks\dynamo\common.py", line 2242, in check_accuracy
    new_result = self.run_n_iterations(
...

  File "F:\miniforge\envs\nightly\lib\site-packages\torch\utils\_triton.py", line 111, in triton_hash_with_backend
    backend = triton_backend()
  File "F:\miniforge\envs\nightly\lib\site-packages\torch\utils\_triton.py", line 103, in triton_backend
    target = driver.active.get_current_target()
  File "F:\miniforge\envs\nightly\lib\site-packages\triton\backends\intel\driver.py", line 741, in get_current_target
    device = self.get_current_device()
  File "F:\miniforge\envs\nightly\lib\site-packages\triton\backends\intel\driver.py", line 733, in get_current_device
    return self.utils.get_current_device()
  File "F:\miniforge\envs\nightly\lib\site-packages\triton\backends\intel\driver.py", line 727, in __getattr__
    self.utils = XPUUtils()
  File "F:\miniforge\envs\nightly\lib\site-packages\triton\backends\intel\driver.py", line 304, in __init__
    self.mod = compile_module_from_src(Path(os.path.join(dirname, "driver.c")).read_text(), "spirv_utils")
  File "F:\miniforge\envs\nightly\lib\site-packages\triton\backends\intel\driver.py", line 269, in compile_module_from_src
    so = _build(name, src_path, tmpdir, COMPILATION_HELPER.library_dir, COMPILATION_HELPER.include_dir,
  File "F:\miniforge\envs\nightly\lib\site-packages\triton\runtime\build.py", line 98, in _build
    subprocess.check_call(cc_cmd, stdout=subprocess.DEVNULL)
  File "F:\miniforge\envs\nightly\lib\subprocess.py", line 369, in check_call
    raise CalledProcessError(retcode, cmd)
torch._inductor.exc.InductorError: CalledProcessError: Command '['C:\\Program Files\\Microsoft Visual Studio\\2022\\Community\\VC\\Tools\\MSVC\\14.43.34808\\bin\\HostX64\\x64\\cl.EXE', '/Zc:__cplusplus', '/std:c++17', 'C:\\Users\\tongsu\\AppData\\Local\\Temp\\tmpvzfhocyq\\main.cpp', '/nologo', '/O2', '/LD', '/wd4996', '/MD', '/EHsc', '/I/usr/local\\include', '/IF:\\miniforge\\envs\\nightly\\Library\\include\\sycl', '/IF:\\miniforge\\envs\\nightly\\Library\\include', '/IF:\\miniforge\\envs\\nightly\\Lib\\site-packages\\triton\\backends\\intel\\include', '/IC:\\Users\\tongsu\\AppData\\Local\\Temp\\tmpvzfhocyq', '/IF:\\miniforge\\envs\\nightly\\Include', '/INone', '/IF:\\miniforge\\envs\\nightly\\lib\\site-packages\\numpy\\_core\\include', '/FoC:\\Users\\tongsu\\AppData\\Local\\Temp\\tmpvzfhocyq\\main.obj', '/link', '/OUT:C:\\Users\\tongsu\\AppData\\Local\\Temp\\tmpvzfhocyq\\spirv_utils.cp310-win_amd64.pyd', '/IMPLIB:C:\\Users\\tongsu\\AppData\\Local\\Temp\\tmpvzfhocyq\\main.lib', '/PDB:C:\\Users\\tongsu\\AppData\\Local\\Temp\\tmpvzfhocyq\\main.pdb', '/LIBPATH:F:\\miniforge\\envs\\nightly\\Library\\bin', '/LIBPATH:F:\\miniforge\\envs\\nightly\\Library\\lib', '/LIBPATH:/usr/local\\lib', '/LIBPATH:F:\\miniforge\\envs\\nightly\\Lib\\site-packages\\triton\\backends\\intel\\lib', '/LIBPATH:F:\\miniforge\\envs\\nightly\\libs', 'ze_loader.lib', 'sycl8.lib', '/LIBPATH:F:\\miniforge\\envs\\nightly\\Library\\bin', '/LIBPATH:F:\\miniforge\\envs\\nightly\\Library\\lib']' returned non-zero exit status 2.

Set TORCHDYNAMO_VERBOSE=1 for the internal stack trace (please do this especially if you're reporting a bug to PyTorch). For even more developer context, set TORCH_LOGS="+dynamo"

TorchDynamo optimized model failed to run because of following error
fail_to_run

If I remove the stdout=subprocess.DEVNULL redirect, I actually get the real error message:

> python benchmarks/dynamo/timm_models.py --accuracy --float16 -d xpu -n10 --training --only fbnetv3_b --backend=inductor 
loading model: 0it [00:16, ?it/s]
xpu  train fbnetv3_b

main.cpp
F:\miniforge\envs\nightly\Lib\site-packages\triton\backends\intel\include\sycl_functions.h(15): fatal error C1083: Cannot open include file: 'level_zero/ze_api.h': No such file or directory

ERROR:common:
Traceback (most recent call last):

So I would personally suggest using run(..., check=True) and printing everything when there is an error (or packing the output into the raised exception).

Environment details

@Stonepia Stonepia changed the title check_call() output is hidden check_call(...stdout=subprocess.DEVNULL)) hides the error output Mar 26, 2025