Pick better default for mp_context #4422

@ecobost

#4333 changes how mp_context is chosen from:

    if pool_engine == "process":
        if mp_context is None:
            mp_context = recording.get_preferred_mp_context()
        if mp_context is not None and platform.system() == "Windows":
            assert mp_context != "fork", "'fork' mp_context not supported on Windows!"
        elif mp_context == "fork" and platform.system() == "Darwin":
            warnings.warn('As of Python 3.8 "fork" is no longer considered safe on macOS')
    self.mp_context = mp_context

to

    if pool_engine == "process":
        if mp_context is None:
            # auto choice
            if platform.system() == "Windows":
                mp_context = "spawn"
            elif platform.system() == "Linux":
                mp_context = "fork"
            elif platform.system() == "Darwin":
                # We used to force spawn for macos, this is sad but in some cases fork in macos
                # is very unstable and lead to crashes.
                mp_context = "spawn"
            else:
                mp_context = "spawn"
        preferred_mp_context = recording.get_preferred_mp_context()
        if preferred_mp_context is not None and preferred_mp_context != mp_context:
            warnings.warn(
                f"Your processing chain using pool_engine='process' and mp_context='{mp_context}' is not possible."
                f"So use mp_context='{preferred_mp_context}' instead"
            )
            mp_context = preferred_mp_context
    self.mp_context = mp_context

Before the PR, the default and most common setup (job_kwargs.mp_context=None and recording.get_preferred_mp_context() returning None) ended up calling multiprocessing.get_context(None), which returns the default context for the machine and Python version: spawn on Windows and macOS, and fork or forkserver on Linux. Now the behavior has changed for Linux: it always picks fork. From Python 3.14 onwards, the default mp_context on Linux is forkserver (a thread-safe, less problematic alternative to fork); see the multiprocessing docs.
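For reference, here is a minimal sketch of what the old path resolved to. The helper name `resolve_default_start_method` is made up for illustration; only the standard-library `multiprocessing` calls are real.

```python
import multiprocessing
import platform
import sys


def resolve_default_start_method() -> str:
    """Return the start method multiprocessing picks when none is requested.

    Note: calling get_context(None) fixes the global default context, so a
    later multiprocessing.set_start_method() needs force=True.
    """
    ctx = multiprocessing.get_context(None)  # platform/version default
    return ctx.get_start_method()


if __name__ == "__main__":
    # The printed method depends on the OS and the Python version.
    print(platform.system(), sys.version_info[:2], resolve_default_start_method())
```

Running this on the same Linux machine under 3.13 and 3.14 should show the default changing, which is the behavior the hard-coded "fork" now overrides.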

When running Kilosort4 on Python 3.14, fork did not play nicely with some OpenMP multi-threaded calls in Kilosort; in particular, the process died silently somewhere within these lines:
https://github.com/MouseLand/Kilosort/blob/7a19a57bef39f3e07ab7fa1edf5aa8635c69a850/kilosort/spikedetect.py#L206-221
This was run inside a Kubernetes pod with GPUs, but I couldn't replicate it locally: 3.13 and 3.14 both worked with fork or forkserver :/ so I'm not sure whether it will be a problem for other people.

In any case, it might be worth considering either making forkserver (rather than fork) the Linux default (it has been available since Python 3.4), or letting multiprocessing pick the best default based on system and Python version as it did before this PR (i.e., leave mp_context=None, with a comment saying what that means).
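A rough sketch of both options, to make the proposal concrete. `choose_mp_context` and its `prefer_forkserver_on_linux` flag are hypothetical names, not existing spikeinterface API, and `preferred` stands in for whatever recording.get_preferred_mp_context() returns.

```python
import multiprocessing
import platform
import warnings
from typing import Optional


def choose_mp_context(
    mp_context: Optional[str],
    preferred: Optional[str] = None,
    prefer_forkserver_on_linux: bool = False,
) -> Optional[str]:
    """Hypothetical helper: None means "let multiprocessing decide"."""
    if mp_context is None and prefer_forkserver_on_linux and platform.system() == "Linux":
        # Option 1: explicit forkserver default on Linux.
        mp_context = "forkserver"
    if preferred is not None and preferred != mp_context:
        if mp_context is not None:
            # Only warn when an explicit choice is being overridden.
            warnings.warn(
                f"mp_context='{mp_context}' is not possible for this processing chain; "
                f"using mp_context='{preferred}' instead"
            )
        mp_context = preferred
    if mp_context is not None and mp_context not in multiprocessing.get_all_start_methods():
        raise ValueError(f"mp_context='{mp_context}' is not available on this platform")
    # Option 2: None is passed through, so get_context(None) picks the default.
    return mp_context


if __name__ == "__main__":
    ctx = multiprocessing.get_context(choose_mp_context(None))
    print(ctx.get_start_method())
```

With the flag left off, this keeps the pre-#4333 behavior for the default setup; with it on, Linux gets forkserver regardless of Python version.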
