Need MPI Chunking to Support Large K Values #4

esaliya · 2019-12-29T21:12:58Z

While generating the sequence-by-k-mer matrix, A, each process keeps track of the unique k-mers local to it. To generate S, the processes have to exchange these local sets to determine the global set of unique k-mers.

The code currently does this as follows.

Every process creates a boolean array of the size |Alph|^k, where |Alph| is the size of the alphabet. For proteins, it's 25^k.
This array serves as the process-local unique k-mer ID list.
Once each process has found its list of k-mers, it participates in an MPI_Allreduce using MPI_LOR.
The result is the global list of unique k-mers.
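To make the steps above concrete, here is a small sketch that runs without MPI. It assumes a k-mer is mapped to its array slot by reading it as a base-|Alph| number, which is only one plausible encoding and may not match what the code does. The names `kmer_index` and `logical_or_all` are made up for illustration, and `logical_or_all` only simulates the element-wise result that an MPI_Allreduce with MPI_LOR would leave on every process.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical encoding (an assumption, not taken from the code base):
// read the k-mer as a base-|Alph| number, so each k-mer gets a unique
// index in [0, |Alph|^k). Returns false if a character is not in the
// alphabet. Overflow for very large k is not checked here.
bool kmer_index(const std::string& kmer, const std::string& alphabet,
                std::uint64_t& index_out) {
    std::uint64_t index = 0;
    for (char c : kmer) {
        std::size_t pos = alphabet.find(c);
        if (pos == std::string::npos) return false;
        index = index * alphabet.size() + pos;
    }
    index_out = index;
    return true;
}

// Element-wise logical OR over the per-process flag arrays. This mimics
// what every process would hold after an MPI_Allreduce with MPI_LOR.
std::vector<char> logical_or_all(
        const std::vector<std::vector<char>>& per_process) {
    std::vector<char> global(
        per_process.empty() ? 0 : per_process[0].size(), 0);
    for (const auto& local : per_process) {
        // All processes must use the same array length, |Alph|^k.
        assert(local.size() == global.size());
        for (std::size_t i = 0; i < global.size(); ++i)
            global[i] = global[i] || local[i];
    }
    return global;
}
```

A slot is set in the combined array if any process saw the corresponding k-mer, which is why a logical OR is enough to build the global list.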

Currently, the code relies on a single MPI_Allreduce, whose count argument is an int, so the boolean array must have fewer than 2³¹ elements. With the protein alphabet of size 25, this works up to k=6 (25⁶ ≈ 2.4 × 10⁸) but fails for any larger k (25⁷ ≈ 6.1 × 10⁹).
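The limit can be checked with a few lines of arithmetic. The sketch below compares |Alph|^k against INT_MAX, the largest count the classic `int`-count MPI_Allreduce binding accepts. The helper names `kmer_space_size` and `fits_single_allreduce` are hypothetical and not part of the code base.

```cpp
#include <cassert>
#include <climits>
#include <cstdint>

// Number of possible k-mers, |Alph|^k, computed in 64 bits.
// The assert guards against 64-bit overflow (25^k fits up to k = 13).
std::uint64_t kmer_space_size(std::uint64_t alphabet_size, unsigned k) {
    std::uint64_t size = 1;
    for (unsigned i = 0; i < k; ++i) {
        assert(alphabet_size == 0 || size <= UINT64_MAX / alphabet_size);
        size *= alphabet_size;
    }
    return size;
}

// True if an array with |Alph|^k elements can be reduced in one call
// whose count parameter is a C int.
bool fits_single_allreduce(std::uint64_t alphabet_size, unsigned k) {
    return kmer_space_size(alphabet_size, k)
           <= static_cast<std::uint64_t>(INT_MAX);
}
```

For proteins, `fits_single_allreduce(25, 6)` holds because 25^6 = 244,140,625, while `fits_single_allreduce(25, 7)` does not because 25^7 = 6,103,515,625 exceeds INT_MAX = 2,147,483,647.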

A solution would be to issue multiple MPI_Allreduce calls, each over a part of the boolean array.
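A rough sketch of how the chunking could look, not a definitive implementation. To keep it runnable without an MPI installation, the reduction is passed in as a callback. `chunked_allreduce`, `reduce_chunk`, and `max_count` are hypothetical names, and the MPI call shown in the trailing comment assumes the array elements are C++ `bool`, which may not match the code.

```cpp
#include <algorithm>
#include <cassert>
#include <climits>
#include <cstddef>
#include <vector>

// Walks the array in consecutive pieces of at most max_count elements
// and calls reduce_chunk(pointer_to_piece, count) on each one, so the
// count passed on always fits in an int. Returns the number of calls.
template <typename T, typename ReduceChunk>
std::size_t chunked_allreduce(T* data, std::size_t total,
                              ReduceChunk reduce_chunk,
                              std::size_t max_count = INT_MAX) {
    assert(max_count > 0 &&
           max_count <= static_cast<std::size_t>(INT_MAX));
    std::size_t calls = 0;
    std::size_t offset = 0;
    while (offset < total) {
        std::size_t count = std::min(max_count, total - offset);
        reduce_chunk(data + offset, static_cast<int>(count));
        offset += count;
        ++calls;
    }
    return calls;
}

// With MPI available, the callback might look like this (assuming the
// flags are stored as C++ bool):
//   [](bool* chunk, int count) {
//       MPI_Allreduce(MPI_IN_PLACE, chunk, count,
//                     MPI_CXX_BOOL, MPI_LOR, MPI_COMM_WORLD);
//   }
```

Every process has the same array length, |Alph|^k, so all processes make the same sequence of collective calls with the same counts, which is what MPI collectives need. Note that this removes only the count limit. Each process still has to allocate the full array, about 6.1 GB at one byte per element for k=7.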
