make NVRTC work with Wave headers #11500

Open. oerling wants to merge 1 commit into main.

@oerling (Contributor) commented Nov 11, 2024

NVRTC requires headers to be extracted into text strings and passed to the compiler. This is done with NVIDIA's jitify utility, which is added to external. Extracting the headers takes a long time, so it is done on first use and the headers are kept as strings afterwards. NVRTC also requires some header substitutions, which jitify provides. jitify additionally supplies a default --gpu-architecture for NVRTC.
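
To illustrate the extract-once pattern described above, here is a minimal sketch in plain C++. It is not the jitify or Wave code: every name (`HeaderMap`, `extractHeaders`, `cachedHeaders`, `headerArrays`, and the `example/*.cuh` header names) is hypothetical, and the "extraction" is a stand-in that returns two fixed strings. The only real API it refers to is `nvrtcCreateProgram`, which takes parallel arrays of header sources and include names; the sketch builds arrays in that shape but does not call NVRTC.

```cpp
#include <map>
#include <string>
#include <vector>

// Hypothetical names throughout; this is not the jitify or Wave API.
using HeaderMap = std::map<std::string, std::string>;

// Counts how often the expensive extraction runs (for illustration only).
inline int& extractionCount() {
  static int count = 0;
  return count;
}

// Stand-in for the slow step that turns header files into strings.
inline HeaderMap extractHeaders() {
  ++extractionCount();
  return {
      {"example/A.cuh", "#pragma once\n__device__ int one() { return 1; }\n"},
      {"example/B.cuh", "#pragma once\n// placeholder header text\n"}};
}

// Extracts on first use; later calls return the cached strings.
// Function-local static initialization is thread-safe since C++11.
inline const HeaderMap& cachedHeaders() {
  static const HeaderMap headers = extractHeaders();
  return headers;
}

// Parallel arrays in the shape that nvrtcCreateProgram takes for its
// 'headers' (sources) and 'includeNames' (names) arguments.
struct HeaderArrays {
  std::vector<const char*> sources;
  std::vector<const char*> names;
};

inline HeaderArrays headerArrays() {
  HeaderArrays arrays;
  for (const auto& entry : cachedHeaders()) {
    arrays.names.push_back(entry.first.c_str());
    arrays.sources.push_back(entry.second.c_str());
  }
  return arrays;
}
```

Because the map lives for the rest of the process, the `const char*` pointers stay valid across compilations, and repeated calls to `headerArrays()` pay the extraction cost only once.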

Even with jitify, NVRTC fails when <cuda/atomic> and cub are included at the same time. We depend on <cuda/atomic> and <cuda/semaphore>, so we extract the warp scan from cub as jit/Scan.cuh. Other cub items may be extracted in the same way if needed.
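
For readers unfamiliar with warp scans, the sketch below emulates a warp-level inclusive prefix sum on the host. It does not show the contents of jit/Scan.cuh, which are not described here; it assumes the common shuffle-based (Hillis-Steele) formulation, where each of the log2(32) steps on a GPU would be a single `__shfl_up_sync` by `delta` lanes. The names `kWarpSize` and `warpInclusiveSum` are hypothetical.

```cpp
#include <array>
#include <cstddef>

// Number of lanes in a warp on current NVIDIA GPUs.
constexpr std::size_t kWarpSize = 32;

// Host-side emulation of a warp-level inclusive prefix sum.
// On a GPU, reading previous[lane - delta] would be one __shfl_up_sync;
// here a copy of the lane values taken before each step plays that role.
inline std::array<int, kWarpSize> warpInclusiveSum(
    std::array<int, kWarpSize> lanes) {
  for (std::size_t delta = 1; delta < kWarpSize; delta *= 2) {
    const std::array<int, kWarpSize> previous = lanes;
    // Lanes below 'delta' have no source lane and keep their value.
    for (std::size_t lane = delta; lane < kWarpSize; ++lane) {
      lanes[lane] = previous[lane] + previous[lane - delta];
    }
  }
  return lanes;
}
```

After the five steps, lane i holds the sum of the inputs of lanes 0 through i, which is the property a test of a warp scan replacement would typically check.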

All Wave headers that NVRTC may need are extracted into jit/Headers.h. A make_headers.sh script is added to regenerate Headers.h from the actual headers. We also add NVIDIA's stringify utility, which applies the necessary escapes when turning headers into string literals.
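
The kind of escaping involved can be sketched as follows. This is not NVIDIA's stringify tool and makes no claim about how it works internally; `toStringLiteral` is a hypothetical helper that handles only backslashes, double quotes, tabs, and newlines, and emits one quoted literal per input line so that the compiler concatenates the adjacent literals.

```cpp
#include <string>

// Hypothetical helper; not the NVIDIA stringify tool.
// Converts raw header text into C++ string literal syntax, starting a
// new quoted literal after every newline so the output stays readable.
inline std::string toStringLiteral(const std::string& text) {
  std::string out = "\"";
  for (char c : text) {
    switch (c) {
      case '\\':
        out += "\\\\";  // backslash becomes two backslashes
        break;
      case '"':
        out += "\\\"";  // quote becomes backslash-quote
        break;
      case '\t':
        out += "\\t";
        break;
      case '\n':
        out += "\\n\"\n\"";  // close this literal, open the next one
        break;
      default:
        out += c;
    }
  }
  out += "\"";
  return out;
}
```

For example, the input line `#include "a.h"` followed by a newline becomes `"#include \"a.h\"\n"` and then an empty `""` literal on the next line, which is harmless after concatenation.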

We add tests that verify the use of <cuda/semaphore> for updating aggregates when the updating kernel is compiled at run time. We also add a test for the warp scan replacement in jit/.

@facebook-github-bot added the CLA Signed and oncall: jit labels Nov 11, 2024

netlify bot commented Nov 11, 2024

Deploy Preview for meta-velox canceled.

Name Link
🔨 Latest commit 274e4fc
🔍 Latest deploy log https://app.netlify.com/sites/meta-velox/deploys/67366f23990f5600081105ee

@facebook-github-bot
Contributor

@oerling has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

@facebook-github-bot
Contributor

This pull request was exported from Phabricator. Differential Revision: D65760296

oerling pushed a commit to oerling/velox-1 that referenced this pull request Nov 13, 2024
Pull Request resolved: facebookincubator#11500

Differential Revision: D65760296

Pulled By: oerling
Labels
CLA Signed, fb-exported, oncall: jit