Chunk all the CUDA calls #14

Open
tbenthompson opened this issue Jun 14, 2021 · 1 comment

Comments

@tbenthompson
Owner

Currently, the screen will freeze if a user runs a large cutde calculation on the GPU that also drives their monitors. This is avoided by chunking the calculation in disp_blocks and disp_aca. Chunking allows a few processes to run in between the kernels and makes the screen much more responsive.

Ideally this chunking would be configurable since it seems to generally slow the computation by 5-10%.
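
To make the idea of configurable chunking concrete, here is a minimal sketch. It is not cutde's actual code: `run_chunked`, `fake_kernel`, and the `chunk_size` parameter are hypothetical names, and the "kernel" is a plain NumPy function standing in for a GPU launch. Passing `chunk_size=None` runs everything in one call, which is the unchunked path that avoids the slowdown.

```python
import numpy as np


def run_chunked(kernel, obs_pts, chunk_size=None):
    """Apply `kernel` to the rows of `obs_pts`, optionally in chunks.

    `kernel` is a hypothetical stand-in for one GPU launch. With
    chunk_size=None the whole array is processed in a single call.
    """
    n = obs_pts.shape[0]
    if chunk_size is None or n == 0:
        return kernel(obs_pts)
    if chunk_size <= 0:
        raise ValueError("chunk_size must be a positive integer or None")

    pieces = []
    for start in range(0, n, chunk_size):
        # Each iteration corresponds to one smaller launch; on a real GPU
        # the gap between launches is where the display gets serviced.
        pieces.append(kernel(obs_pts[start:start + chunk_size]))
    return np.concatenate(pieces, axis=0)


def fake_kernel(pts):
    """Placeholder computation, used only for illustration."""
    return 2.0 * pts
```

A user whose GPU also drives the monitor might call `run_chunked(fake_kernel, pts, chunk_size=4)`, while someone on a headless compute card would leave `chunk_size` unset.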

@tbenthompson
Owner Author

Alternatively, I suspect some asynchronous data transfer running simultaneously with the kernel might help make the chunking as fast as the unchunked version. The synchronous chunked version has lots of blocking operations.
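
A rough sketch of the overlap idea, as a CPU-only analogy rather than real CUDA code: while one chunk is being computed, a worker thread prepares the next one. `stage`, `compute`, and `run_pipelined` are hypothetical names; `stage` stands in for a host-to-device copy and `compute` for a kernel launch. On an actual GPU the equivalent would presumably use CUDA streams and asynchronous copies, which this sketch does not attempt.

```python
from concurrent.futures import ThreadPoolExecutor

import numpy as np


def stage(chunk):
    """Hypothetical stand-in for copying a chunk to the device."""
    return np.ascontiguousarray(chunk, dtype=np.float32)


def compute(staged):
    """Hypothetical stand-in for a kernel launch on a staged chunk."""
    return staged * 2.0


def run_pipelined(pts, chunk_size):
    """Process `pts` in chunks, staging chunk i+1 while computing chunk i."""
    chunks = [pts[s:s + chunk_size] for s in range(0, pts.shape[0], chunk_size)]
    if not chunks:
        return compute(stage(pts))

    results = []
    with ThreadPoolExecutor(max_workers=1) as pool:
        pending = pool.submit(stage, chunks[0])
        for nxt in chunks[1:]:
            staged = pending.result()          # wait for the current chunk
            pending = pool.submit(stage, nxt)  # start staging the next chunk
            results.append(compute(staged))    # compute while staging runs
        results.append(compute(pending.result()))
    return np.concatenate(results, axis=0)
```

The only blocking point left in the loop is `pending.result()`, so the staging cost is hidden whenever it is shorter than the compute step.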
