- 
        Couldn't load subscription status. 
- Fork 109
start merging early thread exit handling #1584
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: develop
Are you sure you want to change the base?
Conversation
| @maddyscientist As for simplifying the calls to  | 
| @jcosborn for those QUDA developers not on the portability calls (who are reviewing this PR) can you describe why these changes are needed? | 
| The early thread exit handling changes are to support programming models (like SYCL) which only support block collectives when all threads in a block are active (non-exited).  The changes allow targets to have all threads enter a kernel functor to participate in the block collectives when block collectives are used in a kernel.  For these kernels, instead of exiting when a thread is determined to be out-of-bounds for the kernel, all threads can enter the kernel, and the out-of-bounds ones will be marked inactive with an extra argument.  Only kernels that need this handling need any changes.  The kernel functors that need this handling have an extra template parameter  | 
This covers all files except the dslashes. Those will be merged later to keep the complexity of this PR down.