TL;DR: A Pyneapple plugin that provides GPU-accelerated curve fitting via a custom build of Gpufit. It registers two solvers (gpufit_curvefit, gpufit_nnls) in Pyneapple's entry-point system and bundles the native CUDA library inside the wheel, so no separate CUDA toolkit installation is required.
- Python ≥ 3.9
- An NVIDIA GPU with a CUDA-compatible driver
- Pyneapple
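As a rough pre-install check of the first two requirements, the sketch below tests the interpreter version and looks for `nvidia-smi` on the PATH. The helper name `check_prerequisites` is hypothetical and not part of this plugin, and finding `nvidia-smi` only suggests that an NVIDIA driver is installed; it does not prove the driver is recent enough.

```python
import shutil
import sys


def check_prerequisites(min_python=(3, 9)):
    """Return a dict of coarse yes/no checks for the requirements listed above.

    Hypothetical helper for illustration only; not part of pyneapple-gpufit.
    """
    return {
        # Python >= 3.9 is the documented minimum.
        "python_ok": sys.version_info >= min_python,
        # nvidia-smi normally ships with the NVIDIA driver, so its presence
        # on PATH is used here as a heuristic for "driver installed".
        "nvidia_smi_found": shutil.which("nvidia-smi") is not None,
    }


print(check_prerequisites())
```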
# pip
pip install git+https://github.com/darksim33/pyneapple-gpufit.git
# uv
uv add git+https://github.com/darksim33/pyneapple-gpufit

The wheel includes Gpufit.dll (Windows) and libGpufit.so (Linux). Only an up-to-date NVIDIA driver is needed, not the CUDA toolkit. The bundled libraries are currently compiled for x86 systems only; other platforms require a custom build. For detailed instructions on how to compile see.
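To confirm after installation that the two solvers were registered, one option is to scan the entry points of all installed distributions for the names gpufit_curvefit and gpufit_nnls. This is a minimal sketch using only the standard library; the helper name `find_solver_entry_points` is hypothetical, and because the entry-point group Pyneapple uses is not stated here, the code matches by name across every group.

```python
from importlib.metadata import distributions


def find_solver_entry_points(names=("gpufit_curvefit", "gpufit_nnls")):
    """Map each matching entry-point name to its (group, target) pair.

    Hypothetical helper; it searches all groups because the group name
    used by Pyneapple is an unknown here.
    """
    found = {}
    for dist in distributions():
        for ep in dist.entry_points:
            if ep.name in names:
                found[ep.name] = (ep.group, ep.value)
    return found


# An empty dict means no matching entry points are installed.
print(find_solver_entry_points())
```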
from pyneapple.models.biexp import BiExpModel
from pyneapple_gpufit import GpuCurveFitSolver
import numpy as np
b_values = np.array([0, 25, 50, 75, 100, 150, 200, 300, 400, 600, 800, 1000, 1200])
model = BiExpModel(fit_reduced=True) # 3-parameter IVIM: f1, D1, D2
solver = GpuCurveFitSolver(
model=model,
p0={"f1": 0.2, "D1": 0.01, "D2": 0.001},
bounds={"f1": (0.0, 1.0), "D1": (1e-4, 0.1), "D2": (1e-5, 0.01)},
)
# ydata: measured signal, shape (n_pixels, n_b_values), or (n_b_values,) for a single voxel
solver.fit(b_values, ydata)
params = solver.get_params() # {"f1": ..., "D1": ..., "D2": ...}
diag = solver.get_diagnostics() # {"states": ..., "chi_squares": ..., ...}

[Fitting]
model = "BiExp"
fit_reduced = true
[Fitting.solver]
type = "gpufit_curvefit"
max_iter = 500
tol = 1e-4
[Fitting.solver.p0]
f1 = 0.2
D1 = 0.010
D2 = 0.001
[Fitting.solver.bounds]
f1 = [0.0, 1.0 ]
D1 = [1e-4, 0.1 ]
D2 = [1e-5, 0.01]

| Pyneapple model | fit_reduced | fit_s0 | GPU kernel | Parameters |
|---|---|---|---|---|
| MonoExpModel | — | — | MONOEXP | S0, D |
| BiExpModel | True (default) | False | BIEXP_RED | f1, D1, D2 |
| BiExpModel | False | False | BIEXP | f1, D1, f2, D2 |
| BiExpModel | — | True | BIEXP_S0 | f1, D1, D2, S0 |
| TriExpModel | True (default) | False | TRIEXP_RED | f1, D1, f2, D2, D3 |
| TriExpModel | False | False | TRIEXP | f1, D1, f2, D2, f3, D3 |
| TriExpModel | — | True | TRIEXP_S0 | f1, D1, f2, D2, D3, S0 |
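The mapping in the table can be read as a small decision rule, sketched below. `select_kernel` is a hypothetical helper written for illustration, not the plugin's internal dispatch, and it assumes that a dash in the fit_reduced column means the flag is ignored when fit_s0 is True.

```python
def select_kernel(model_name, fit_reduced=True, fit_s0=False):
    """Return the GPU kernel name for a model configuration, per the table.

    Hypothetical helper; the plugin's own selection logic may differ.
    """
    if model_name == "MonoExpModel":
        return "MONOEXP"  # neither flag applies to the mono-exponential model

    prefixes = {"BiExpModel": "BIEXP", "TriExpModel": "TRIEXP"}
    if model_name not in prefixes:
        raise ValueError(f"no GPU kernel listed for {model_name!r}")
    prefix = prefixes[model_name]

    if fit_s0:
        # Assumption: fit_reduced is ignored for the S0 variants.
        return f"{prefix}_S0"
    return f"{prefix}_RED" if fit_reduced else prefix


print(select_kernel("BiExpModel"))                      # BIEXP_RED
print(select_kernel("TriExpModel", fit_reduced=False))  # TRIEXP
print(select_kernel("BiExpModel", fit_s0=True))         # BIEXP_S0
```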
pyneapple v2.0 note: For fit_s0=True models the GPU kernel places S0 last. If you pass ndarray-style p0 or bounds to fit(), column order must match the kernel parameter order above. Dict-style inputs are looked up by key and are unaffected.
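One way to avoid ordering mistakes with array-style inputs is to start from a dict and emit the values in kernel order. The sketch below copies the parameter order from the table; `KERNEL_PARAM_ORDER` and `to_kernel_order` are hypothetical names, and the S0 starting value of 1.0 is illustrative only.

```python
# Parameter order per GPU kernel, copied from the table above.
KERNEL_PARAM_ORDER = {
    "MONOEXP": ("S0", "D"),
    "BIEXP_RED": ("f1", "D1", "D2"),
    "BIEXP": ("f1", "D1", "f2", "D2"),
    "BIEXP_S0": ("f1", "D1", "D2", "S0"),
    "TRIEXP_RED": ("f1", "D1", "f2", "D2", "D3"),
    "TRIEXP": ("f1", "D1", "f2", "D2", "f3", "D3"),
    "TRIEXP_S0": ("f1", "D1", "f2", "D2", "D3", "S0"),
}


def to_kernel_order(kernel, values):
    """Return the dict's values as a list in the kernel's parameter order.

    Hypothetical helper; raises KeyError if names are missing or unexpected.
    """
    order = KERNEL_PARAM_ORDER[kernel]
    if set(values) != set(order):
        raise KeyError(f"{kernel} expects exactly {order}, got {sorted(values)}")
    return [values[name] for name in order]


# S0 is listed first in the dict but lands last, as the kernel expects.
p0 = to_kernel_order("BIEXP_S0", {"S0": 1.0, "f1": 0.2, "D1": 0.01, "D2": 0.001})
print(p0)  # [0.2, 0.01, 0.001, 1.0]
```

The resulting list can be wrapped with np.asarray before it is passed on. The exact array shape that fit() expects for bounds is not specified here, so check it before reusing this helper for bounds.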
Models with T1 correction (fit_t1=True) are not currently supported — use Pyneapple's CPU CurveFitSolver instead.
The GPU fitting engine is based on Gpufit. If you use pyneapple-gpufit in published work, cite:
Przybylski, A., Throm, B., Kaderali, L. & Grüll, H.
Gpufit: An open-source toolkit for GPU-accelerated curve fitting.
Scientific Reports 7, 15722 (2017).
https://doi.org/10.1038/s41598-017-15313-9
The CUDA kernels used by this plugin are adapted from darksim33/GPUfit, a fork of the upstream Gpufit library that extends it with diffusion MRI models (BIEXP_RED, TRIEXP_RED, MONOEXP_RED, and their T1/S0 correction variants).
# clone and install in editable mode (requires uv)
git clone https://github.com/darksim33/pyneapple-gpufit
cd pyneapple-gpufit
uv sync --dev
uv run pytest tests

See docs/ for the full API reference and implementation notes.
GPL-3.0-or-later — see LICENSE for details.