Skip to content

Incorrect results produced by warp shuffles in gpgpu-sim #230

Open
@ueqri

Description

@ueqri

Hello everyone,

I am writing this issue to ask if you could give me some suggestions about how to solve the inaccurate result produced by warp primitives running in gpgpu-sim.

Code snippet: The minimal code comes from the official tutorial of CUDA: https://docs.nvidia.com/cuda/cuda-c-programming-guide/index.html#warp-examples-broadcast.

#include <stdio.h>

__global__ void bcast(int arg) {
    int laneId = threadIdx.x & 0x1f;
    int value;
    if (laneId == 0)        // Note unused variable for
        value = arg;        // all threads except lane 0
    value = __shfl_sync(0xffffffff, value, 0);   // Synchronize all threads in warp, and get "value" from lane 0
    if (value != arg)
        printf("Thread %d failed.\n", threadIdx.x);
}

int main() {
    bcast<<< 1, 32 >>>(1234);
    cudaDeviceSynchronize();

    return 0;
}

Build environment: I used is the image jonghyun1215/gpgpu:gpgpusim4 from docker hub, with GCC 7.5, gpgpu-sim 4.0.0(commit ID:90ec33997, exactly the latest commit in branch dev), CUDA 10.1.

Situation: The sample code shouldn't print the failed message and it was tested in a real GPU environment which got the expected results. But when running it through gpgpu-sim, no matter using performance simulation or functional simulation, the results are wrong. 😕

Investigation: I turned to other warp samples like shfl_down_sync, shfl_xor_sync in that tutorial, the correctness error still exists. For comparison, I also wrote a simple reduction using two methods separately, 1) shared memory, 2) warp shuffle, the result of shared memory is exactly correct, but warp shuffle is not, which confused me a lot. Thus, I guess there are some bugs in the implementations of warp primitives in gpgpu-sim.

Possible Parts: To locate the relevant part in the gpgpu-sim codebase, I searched for the shfl PTX operator, and found the implementations here: link. Not very experienced in the codes of gpgpu-sim, I've been blocked in these step for few days.

I would appreciate it sincerely if you could help me with this trouble. Thanks for your consideration! ☺️

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions