PyNN/NEST MPI hang when retrieving recorded spikes from SpikeSourceArray
Summary
Population.get_data(["spikes"], clear=True) appears to hang under MPI when the recorded population is a PyNN/NEST SpikeSourceArray.
The same script runs in serial. Under mpirun -np 2, both ranks reach the line immediately before get_data(), then block indefinitely.
Minimal reproducer
from mpi4py import MPI
from pyNN import nest as sim
import nest
import pyNN


def nest_version():
    # Fall back to nest.__version__ when nest.version() is not available.
    if hasattr(nest, "version"):
        return nest.version()
    return getattr(nest, "__version__", "unknown")


def rank_print(*values):
    # Prefix every message with the MPI rank and flush so output is not lost on a hang.
    print("rank", MPI.COMM_WORLD.rank, *values, flush=True)


sim.setup(timestep=0.1, min_delay=0.1, max_delay=5.0, threads=1)

src = sim.Population(
    4,
    sim.SpikeSourceArray(spike_times=[0.5]),
    label="recorded_spike_source_array",
)
src.record("spikes")

rank_print("versions", "pyNN=%s" % pyNN.__version__, "nest=%s" % nest_version())
rank_print("before run")
sim.run(1.0)

rank_print("before get_data")
block = src.get_data(["spikes"], clear=True)
rank_print("after get_data", "segments=%d" % len(block.segments))

sim.end()
Command
timeout 45 mpirun -np 2 python spikesourcearray_get_data_mpi_hang.py
Observed output
The script prints:
rank 0 versions pyNN=0.11.0 nest=3.4
rank 1 versions pyNN=0.11.0 nest=3.4
rank 0 before run
rank 1 before run
rank 1 before get_data
rank 0 before get_data
and then blocks until killed by timeout, which exits with code 124.
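For scripted or CI runs, the hang can be told apart from an ordinary failure by the exit status. The following is a minimal sketch and not part of the reproducer: the helper names `classify_run` and `run_reproducer` are hypothetical, and it assumes the command is wrapped in GNU `timeout`, which returns 124 when the time limit expires.

```python
import subprocess

HANG_EXIT_CODE = 124  # status GNU timeout returns when the time limit expires


def classify_run(returncode):
    """Map an exit status from `timeout ... mpirun ...` to a short label."""
    if returncode == 0:
        return "ok"
    if returncode == HANG_EXIT_CODE:
        return "hang"
    return "error"


def run_reproducer(command):
    """Run a command given as a list of arguments and classify its exit status."""
    completed = subprocess.run(command)
    return classify_run(completed.returncode)
```

With the command above, `run_reproducer(["timeout", "45", "mpirun", "-np", "2", "python", "spikesourcearray_get_data_mpi_hang.py"])` would be expected to return `"hang"` in the failing environment.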
Expected behavior
Both ranks should complete the get_data() call and print after get_data.
Additional checks
These related cases do not hang under MPI in the same environment:
- A SpikeSourceArray population exists but is not recorded.
- A SpikeSourceArray population is connected to a downstream IF_cond_exp population, but only the downstream population is recorded.
- Reading the spike_times parameter with src.get("spike_times") completes.
The hang appears specific to recording/retrieving spikes from the SpikeSourceArray population itself.
Environment
Observed with:
PyNN 0.11.0
NEST 3.4
MPI via mpirun -np 2
The reproducer prints the exact pyNN.__version__ and NEST version values at runtime.