Recording from SpikeSourceArray hangs under MPI #836

@antolikjan

PyNN/NEST MPI hang when retrieving recorded spikes from SpikeSourceArray

Summary

Population.get_data(["spikes"], clear=True) appears to hang under MPI when the recorded population is a PyNN/NEST SpikeSourceArray.

The same script runs to completion in serial. Under mpirun -np 2, both ranks reach the line immediately before get_data(), then block indefinitely.

Minimal reproducer

from mpi4py import MPI
from pyNN import nest as sim

import nest
import pyNN


def nest_version():
    if hasattr(nest, "version"):
        return nest.version()
    return getattr(nest, "__version__", "unknown")


def rank_print(*values):
    print("rank", MPI.COMM_WORLD.rank, *values, flush=True)


sim.setup(timestep=0.1, min_delay=0.1, max_delay=5.0, threads=1)

src = sim.Population(
    4,
    sim.SpikeSourceArray(spike_times=[0.5]),
    label="recorded_spike_source_array",
)
src.record("spikes")

rank_print("versions", "pyNN=%s" % pyNN.__version__, "nest=%s" % nest_version())
rank_print("before run")
sim.run(1.0)
rank_print("before get_data")

block = src.get_data(["spikes"], clear=True)

rank_print("after get_data", "segments=%d" % len(block.segments))
sim.end()

Command

timeout 45 mpirun -np 2 python spikesourcearray_get_data_mpi_hang.py
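For automated checks, the same timeout-guarded invocation can be sketched in Python with the standard library. This is only an illustration: run_with_timeout and HANG_EXIT_CODE are hypothetical names, and returning 124 is a convention chosen here to match what the shell timeout command reports, not something subprocess does on its own.

```python
import subprocess
import sys

# Assumption: reuse the status that GNU timeout reports when it kills a command.
HANG_EXIT_CODE = 124


def run_with_timeout(cmd, timeout_s):
    """Run cmd; return its exit code, or HANG_EXIT_CODE if it exceeds timeout_s."""
    try:
        completed = subprocess.run(cmd, timeout=timeout_s)
    except subprocess.TimeoutExpired:
        # subprocess.run() has already killed the child process at this point.
        return HANG_EXIT_CODE
    return completed.returncode


# Intended use with the reproducer (commented out so the block has no side effects):
# cmd = ["mpirun", "-np", "2", sys.executable,
#        "spikesourcearray_get_data_mpi_hang.py"]
# print("hang" if run_with_timeout(cmd, 45) == HANG_EXIT_CODE else "completed")
```

Note that subprocess.run() kills only the direct child (mpirun) on timeout, so rank processes may need to be cleaned up separately; the shell timeout command sends SIGTERM by default instead.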

Observed output

The script prints:

rank 0 versions pyNN=0.11.0 nest=3.4
rank 1 versions pyNN=0.11.0 nest=3.4
rank 0 before run
rank 1 before run
rank 1 before get_data
rank 0 before get_data

and then blocks until killed by timeout, which exits with code 124.
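To see where each rank is blocked, a watchdog from the standard-library faulthandler module could be armed just before the get_data() call. The sketch below is a diagnostic aid, not part of the original reproducer; arm_hang_watchdog and disarm_hang_watchdog are hypothetical helper names, and the 20-second delay is an arbitrary choice below the 45-second timeout.

```python
import faulthandler
import sys


def arm_hang_watchdog(timeout_s, stream=sys.stderr):
    """Dump all Python thread tracebacks to stream if still running after timeout_s."""
    # exit=False: the process keeps running after the dump, so the outer
    # timeout still decides when the job is killed.
    faulthandler.dump_traceback_later(timeout_s, exit=False, file=stream)


def disarm_hang_watchdog():
    """Cancel the pending dump once the guarded call has returned."""
    faulthandler.cancel_dump_traceback_later()


# Intended placement in the reproducer (commented out so this block runs standalone):
# arm_hang_watchdog(20)
# block = src.get_data(["spikes"], clear=True)
# disarm_hang_watchdog()
```

The dump shows only the Python-level stack of each rank, which should be enough to tell which PyNN call is waiting, but it will not show frames inside NEST or the MPI library.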

Expected behavior

Both ranks should complete the get_data() call and print after get_data.

Additional checks

These related cases do not hang under MPI in the same environment:

  1. A SpikeSourceArray population exists but is not recorded.
  2. A SpikeSourceArray population is connected to a downstream IF_cond_exp population, but only the downstream population is recorded.
  3. Reading the spike_times parameter with src.get("spike_times") completes.

The hang appears specific to recording/retrieving spikes from the SpikeSourceArray population itself.
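For reference, control case 2 above can be sketched roughly as follows. This is a reconstruction under assumptions, not the exact script used for the check: the function names are hypothetical, the connector, weight, and delay are illustrative choices, and the backend module is passed in as sim (for example from pyNN import nest as sim).

```python
def build_downstream_control(sim, n=4):
    """Control case 2: a spike source drives a downstream population,
    and only the downstream population is recorded.

    sim is a PyNN backend module, e.g. `from pyNN import nest as sim`.
    """
    sim.setup(timestep=0.1, min_delay=0.1, max_delay=5.0, threads=1)
    src = sim.Population(
        n, sim.SpikeSourceArray(spike_times=[0.5]), label="unrecorded_source"
    )
    tgt = sim.Population(n, sim.IF_cond_exp(), label="recorded_target")
    sim.Projection(
        src,
        tgt,
        sim.AllToAllConnector(),
        # Assumption: weight and delay are placeholders, not values from the report.
        synapse_type=sim.StaticSynapse(weight=0.01, delay=0.1),
        receptor_type="excitatory",
    )
    tgt.record("spikes")  # src.record() is deliberately not called
    return src, tgt


def run_and_collect(sim, population, duration=1.0):
    """Run the simulation and retrieve spikes from the given population."""
    sim.run(duration)
    block = population.get_data(["spikes"], clear=True)
    sim.end()
    return block
```

Calling run_and_collect(sim, tgt) exercises the variant reported not to hang; passing src after adding src.record("spikes") would correspond to the failing reproducer.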

Environment

Observed with:

PyNN 0.11.0
NEST 3.4
MPI via mpirun -np 2

The reproducer prints the exact pyNN.__version__ and NEST version values at runtime.
