We expect a performance boost from switching the current non-blocking MPI communication routines to one-sided communication. In addition, this would fix the host-side memory leak in MPICH (isend/irecv leak memory). I'm opening this issue to explore the idea.
Changes required would be:
- The mpi4cpp library does not have bindings for the one-sided communication routines, so these need to be added to the library first.
- Runko: the `send_data`/`recv_data` functions in tiles use `isend` and `irecv` calls that need to be modified.
  - The easiest switch might be to use the `Rput`/`Rget` variant of one-sided communication, which (like the non-blocking calls) returns requests. We can then check for message completion with the same `mpi_wait` functionality as before.
- MPI windows need to be opened at the beginning of the communications.
  - This could easily be done at the start (e.g., in a prelude step) for the `E` and `B` arrays.
  - The `J` array is more complicated because filtering uses an implicit `std::swap` to swap this array with a temporary allocation.
  - Similarly, `ParticleContainer`s can be re-allocated, so some functionality to dynamically open/close their windows is needed.
- MPI fences need to be set up for the communicators.
  - As an MVP, each rank could set these for `comm_world`, but this might result in a large overhead.
  - A better implementation would use corgi to sniff out the neighboring ranks and split off a local communicator for each rank, each with its own fences.