
Fix shuffle bug in CodeGen C. #8567

Merged: 7 commits, Feb 18, 2025


mcourteaux
Contributor

I reopened this PR from a different branch. I wasn't expecting it to take a month to merge, so I moved it off my main branch and onto a new one.


Shuffle emission in OpenCL was broken when the inputs to the Shuffle node were actual vectors instead of scalars. For some reason, in most scenarios where the codegen makes its way to a Shuffle node, the Shuffle contains only vectors with 1 lane, so the bug caused no real issue before this PR. Today I ran into a case where the OpenCL codegen had to shuffle multiple actual vectors.
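
To make the distinction concrete, here is a minimal reference sketch of the shuffle semantics at stake, assuming the usual convention that the indices address the concatenation of all input vectors. The helper names `locate_lane` and `reference_shuffle` are hypothetical and are not part of Halide or of this PR's diff. The sketch only illustrates why a multi-lane input needs a (vector, lane) lookup, whereas 1-lane inputs let the flat index double as the vector index.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical helper (not Halide's API): map a flat shuffle index, which
// addresses the concatenation of all inputs, to (input vector, lane in it).
std::pair<std::size_t, int> locate_lane(const std::vector<int> &lanes_per_input,
                                        int flat_index) {
    assert(flat_index >= 0);
    for (std::size_t v = 0; v < lanes_per_input.size(); ++v) {
        if (flat_index < lanes_per_input[v]) {
            return {v, flat_index};
        }
        // Skip past this input and keep searching in the next one.
        flat_index -= lanes_per_input[v];
    }
    assert(false && "shuffle index out of range");
    return {0, 0};
}

// Reference shuffle: output lane i is the element found at flat index
// indices[i] in the concatenation of the inputs.
std::vector<int> reference_shuffle(const std::vector<std::vector<int>> &inputs,
                                   const std::vector<int> &indices) {
    std::vector<int> lanes_per_input;
    for (const auto &v : inputs) {
        lanes_per_input.push_back(static_cast<int>(v.size()));
    }
    std::vector<int> out;
    out.reserve(indices.size());
    for (int idx : indices) {
        const auto loc = locate_lane(lanes_per_input, idx);
        out.push_back(inputs[loc.first][loc.second]);
    }
    return out;
}
```

With two 4-lane inputs, flat index 6 resolves to lane 2 of the second vector. If every input has exactly one lane, `locate_lane` always returns lane 0 and the flat index equals the vector index, which is consistent with the problem staying hidden in the common case.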

@derek-gerstmann I had to disable Vulkan testing because it fails on my machine. Could you check that out? Here is the last excerpt from the DEBUG_CODEGEN=1 output:

Skipping Hexagon offload...
Offloading GPU loops...
 CodeGen_Vulkan_Dev::SPIRV_Emitter::visit(Shuffle): type=int32x8 vectors=2 is_interleave=false is_extract_element=false
 CodeGen_Vulkan_Dev::SPIRV_Emitter::visit(Shuffle): type=int32x4 vectors=4 is_interleave=false is_extract_element=false
    vector shuffle x4 : 0 1 2 3 
 CodeGen_Vulkan_Dev::SPIRV_Emitter::visit(Shuffle): type=int32x4 vectors=4 is_interleave=false is_extract_element=false
    vector shuffle x4 : 0 1 2 3 
    vector shuffle x2 : 3 1 6 7 2 4 0 5 
 CodeGen_Vulkan_Dev::SPIRV_Emitter::visit(Shuffle): type=int32x4 vectors=1 is_interleave=false is_extract_element=false
    vector shuffle x1 : 0 1 2 3 
 CodeGen_Vulkan_Dev::SPIRV_Emitter::visit(Shuffle): type=int32x4 vectors=1 is_interleave=false is_extract_element=false
    vector shuffle x1 : 4 5 6 7 
 CodeGen_Vulkan_Dev::SPIRV_Emitter::visit(Shuffle): type=int32x2 vectors=1 is_interleave=false is_extract_element=false
    vector shuffle x1 : 0 1 
 CodeGen_Vulkan_Dev::SPIRV_Emitter::visit(Shuffle): type=int32x2 vectors=1 is_interleave=false is_extract_element=false
    vector shuffle x1 : 2 3 
 CodeGen_Vulkan_Dev::SPIRV_Emitter::visit(Shuffle): type=int32 vectors=1 is_interleave=false is_extract_element=true
 CodeGen_Vulkan_Dev::SPIRV_Emitter::visit(Shuffle): type=int32 vectors=1 is_interleave=false is_extract_element=true
Vulkan: Using static workgroup local size [8, 1, 1]...
  kernel_count = 1
  spirv_module_size[0] = 2432 bytes
Lowering Parallel Tasks...
Embedding image vulkan_buf
Embedding image vulkan_gpu_source_kernels
Target triple of initial module: x86_64--linux-gnu
Generating llvm bitcode...
Generating llvm bitcode prolog for function g...
Generating llvm bitcode for function g...
JIT compiling g for x86-64-linux-tune_znver1-avx-avx2-f16c-fma-jit-sse41-user_context-vk_v13-vulkan
[New Thread 0x7fffed4006c0 (LWP 434073)]
[New Thread 0x7fffeca006c0 (LWP 434074)]
[New Thread 0x7fffe5c006c0 (LWP 434075)]
[New Thread 0x7fffe40006c0 (LWP 434077)]
[New Thread 0x7fffe36006c0 (LWP 434079)]
NVVM compilation failed: 1
Vulkan [WARNING]: (user_context=0x7fffffffc510, id=2, name:NVIDIA) CreatePipeline: failed to compile internal representation
Vulkan [WARNING]: (user_context=0x7fffffffc510, id=2, name:NVIDIA) CreatePipeline: unexpected compilation failure
Vulkan [WARNING]: (user_context=0x7fffffffc510, id=2, name:NVIDIA) CreateComputePipeline: unexpected failure compiling SPIR-V shader: 0x9c4841a2cbb3db9d
User error triggered at /home/martijn/zec/3rd/halide/src/JITModule.cpp:1232
Error: Vulkan: Failed to create compute pipeline! vkCreateComputePipelines returned <Unknown Vulkan Result Code>
Vulkan: Failed to create compute pipeline!
Vulkan: Failed to setup compute pipeline!

@abadams
Member

abadams commented Feb 14, 2025

I think @shoaibkamil has been snowed under with work. Perhaps @derek-gerstmann could take a look?

@mcourteaux mcourteaux requested review from halidebuildbots and removed request for halidebuildbots February 14, 2025 12:53
@derek-gerstmann derek-gerstmann self-requested a review February 14, 2025 13:45
@derek-gerstmann
Contributor

@abadams Sure, I’ll have a look. I’ll need some time to diagnose the Vulkan codegen issues, but I’ll review what’s been submitted so far.

@derek-gerstmann
Contributor

What’s the status on all the other backends? Cuda, D3D12, etc? Does this only affect GPU C-Source based code paths?

@mcourteaux
Contributor Author

mcourteaux commented Feb 14, 2025

> What’s the status on all the other backends? Cuda, D3D12, etc? Does this only affect GPU C-Source based code paths?

Yes, this only affected the C-based backends (OpenCL, D3D12, WebGPU). The PTX codegen is based on the LLVM codegen backend, which doesn't have this issue. The Vulkan backend works via SPIR-V, and I opened a separate issue for it in #8580.

@mcourteaux
Contributor Author

@derek-gerstmann Can you update the review? I'd like to start clearing my PRs.

@derek-gerstmann
Contributor

> @derek-gerstmann Can you update the review? I'd like to start clearing my PRs.

Sure! LGTM. Could you add an issue for the Vulkan CodeGen so I can track it? Thx!

@mcourteaux mcourteaux merged commit 2e36da4 into halide:main Feb 18, 2025
16 of 17 checks passed