Chunk all the CUDA calls #14

tbenthompson · 2021-06-14T19:06:47Z

Currently, the screen will freeze if a user runs a large cutde calculation on the GPU that also drives their monitors. This is avoided by chunking the calculation in disp_blocks and disp_aca. Chunking allows a few processes to run in between the kernels and makes the screen much more responsive.

Ideally this chunking would be configurable since it seems to generally slow the computation by 5-10%.
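
A minimal sketch of what a configurable chunk size could look like. This is not cutde's actual code: `run_chunked`, `fake_kernel`, and the `chunk_size` parameter are hypothetical names, and the "kernel" is a plain Python function standing in for a GPU launch. Passing `chunk_size=None` skips chunking entirely, so a user with a dedicated compute GPU would not pay the slowdown.

```python
from typing import Callable, List, Optional, Sequence


def run_chunked(
    kernel: Callable[[Sequence[float]], List[float]],
    inputs: Sequence[float],
    chunk_size: Optional[int] = None,
) -> List[float]:
    """Apply `kernel` to `inputs` in slices of at most `chunk_size` items.

    `kernel` is a hypothetical stand-in for one GPU kernel launch.
    chunk_size=None runs everything in a single call (no chunking).
    """
    if chunk_size is None:
        return list(kernel(inputs))
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive or None")

    out: List[float] = []
    for start in range(0, len(inputs), chunk_size):
        # Each call returns before the next one starts. That gap between
        # launches is what gives other work a chance to use the device.
        out.extend(kernel(inputs[start:start + chunk_size]))
    return out


def fake_kernel(xs: Sequence[float]) -> List[float]:
    # Placeholder for the real computation.
    return [2.0 * x for x in xs]


data = [float(i) for i in range(10)]
unchunked = run_chunked(fake_kernel, data)
chunked = run_chunked(fake_kernel, data, chunk_size=4)
```

Both calls should produce the same values. Only the number of kernel invocations differs (one versus three for ten inputs with `chunk_size=4`).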

tbenthompson · 2021-08-17T23:39:40Z

Alternatively, I suspect some asynchronous data transfer running simultaneously with the kernel might help make the chunking as fast as the unchunked version. The synchronous chunked version has lots of blocking operations.
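
A rough sketch of the overlap idea, assuming the result copy for one chunk can proceed while the next chunk is being computed. This is only a structural illustration in plain Python: a background thread stands in for an asynchronous device-to-host transfer, and `run_pipelined`, `compute`, and `copy_back` are hypothetical names, not cutde or CUDA APIs. A real GPU implementation would presumably use streams and asynchronous copies instead of threads.

```python
from concurrent.futures import Future, ThreadPoolExecutor
from typing import Callable, List, Optional, Sequence


def run_pipelined(
    compute: Callable[[Sequence[float]], List[float]],
    copy_back: Callable[[List[float]], List[float]],
    inputs: Sequence[float],
    chunk_size: int,
) -> List[float]:
    """Compute chunk i+1 while the copy of chunk i runs in the background."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")

    results: List[float] = []
    pending: Optional[Future] = None
    # One worker thread plays the role of a dedicated transfer queue.
    with ThreadPoolExecutor(max_workers=1) as copier:
        for start in range(0, len(inputs), chunk_size):
            # Compute the current chunk. If a copy of the previous chunk
            # is in flight, it proceeds concurrently with this call.
            raw = compute(inputs[start:start + chunk_size])
            if pending is not None:
                # Collect the previous chunk's copy, in order.
                results.extend(pending.result())
            # Start copying this chunk without waiting for it to finish.
            pending = copier.submit(copy_back, raw)
        if pending is not None:
            results.extend(pending.result())
    return results


def fake_compute(xs: Sequence[float]) -> List[float]:
    # Placeholder for a kernel launch.
    return [2.0 * x for x in xs]


def fake_copy(xs: List[float]) -> List[float]:
    # Placeholder for a device-to-host transfer.
    return list(xs)


pipelined = run_pipelined(fake_compute, fake_copy, [float(i) for i in range(10)], 4)
```

The only blocking wait in the loop is for the previous chunk's copy, and that wait happens after the current chunk's compute has already been issued. Whether this recovers the 5-10% in practice would need measuring.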
