BitGenerator support #499

Open
wants to merge 43 commits into
base: main

@flying-sheep flying-sheep commented Jun 6, 2025

See

Fixes #498

The idea is to have a safe wrapper around the npy_bitgen struct that implements rand::RngCore. That way pyo3 functions could be passed a np.random.Generator, get that wrapper from it, and pass it to Rust APIs, which could then call its methods repeatedly.

The way it’s implemented, the workflow would look like this:

  1. acquire GIL
  2. downcast a np.random.BitGenerator instance into a numpy::random::PyBitGenerator.
  3. call .lock() on it to get a numpy::random::PyBitGeneratorGuard.
  4. release GIL
  5. call functions on guard object without needing to hold the GIL
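
The steps above can be sketched with the standard library alone. Everything here is hypothetical: `MockBitGenerator`, `MockGuard`, and `lock` are illustrative stand-ins rather than the PR's types, an `AtomicBool` plays the role of the generator's `lock` attribute, and a SplitMix64 step stands in for the real bit generator. The point is only the shape: lock once, move the guard to code that runs without the GIL, and release on drop.

```rust
use std::sync::atomic::{AtomicBool, AtomicU64, Ordering};
use std::sync::Arc;
use std::thread;

/// Hypothetical stand-in for a Python-side BitGenerator:
/// a state word plus a flag playing the role of its `lock` attribute.
struct MockBitGenerator {
    locked: AtomicBool,
    state: AtomicU64,
}

/// Hypothetical stand-in for the guard returned by `.lock()`.
struct MockGuard {
    owner: Arc<MockBitGenerator>,
}

/// Steps 2-3: try to take the lock; fail if it is already held.
fn lock(bitgen: &Arc<MockBitGenerator>) -> Result<MockGuard, &'static str> {
    bitgen
        .locked
        .compare_exchange(false, true, Ordering::Acquire, Ordering::Relaxed)
        .map(|_| MockGuard { owner: Arc::clone(bitgen) })
        .map_err(|_| "BitGenerator is already locked")
}

impl MockGuard {
    /// Step 5: draw a value (SplitMix64 here, purely as a placeholder).
    fn next_u64(&mut self) -> u64 {
        let s = self
            .owner
            .state
            .load(Ordering::Relaxed)
            .wrapping_add(0x9E37_79B9_7F4A_7C15);
        self.owner.state.store(s, Ordering::Relaxed);
        let mut z = s;
        z = (z ^ (z >> 30)).wrapping_mul(0xBF58_476D_1CE4_E5B9);
        z = (z ^ (z >> 27)).wrapping_mul(0x94D0_49BB_1331_11EB);
        z ^ (z >> 31)
    }
}

impl Drop for MockGuard {
    /// Dropping the guard releases the lock again.
    fn drop(&mut self) {
        self.owner.locked.store(false, Ordering::Release);
    }
}

/// Returns (second lock rejected, two draws differ, relock after drop works).
fn workflow_demo() -> (bool, bool, bool) {
    let bitgen = Arc::new(MockBitGenerator {
        locked: AtomicBool::new(false),
        state: AtomicU64::new(42),
    });
    let mut guard = lock(&bitgen).expect("first lock succeeds");
    let second_rejected = lock(&bitgen).is_err();
    // Steps 4-5: the guard moves to a worker thread and is used there.
    let worker = thread::spawn(move || guard.next_u64() != guard.next_u64());
    let draws_differ = worker.join().unwrap();
    // The guard was dropped when the worker finished, so locking works again.
    let relock_ok = lock(&bitgen).is_ok();
    (second_rejected, draws_differ, relock_ok)
}
```

In this sketch the release happens in `Drop`; the review discussion below also considers an explicit release that takes a Python token.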

TODO:

  • I see local crashes when running all tests, so there’s probably some UB. I’d appreciate help fixing it.

Safety

If somebody releases the threading lock of the BitGenerator while we’re using it, this isn’t safe 🤔

API design options

I could make this more complex by adding a new trait implemented by both PyBitGenerator and PyBitGeneratorGuard, letting users choose whether to

  • use the PyBitGenerator’s random_* methods directly on that object while holding the GIL and without locking it
  • use it like it’s used now, by locking the np.random.BitGenerator and returning a GIL-free object that can be used.

but for now I just implemented the use case that’s actually desired.
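
A rough sketch of what such a shared trait could look like, using only the standard library. All names are hypothetical (`BitSource`, `GilToken`, `GilBound`, `Detached`) and none of them come from the PR; a xorshift64 step stands in for the real generator. `GilBound` borrows a token, so it cannot outlive the scope in which the token exists, while `Detached` owns its state and can be used anywhere.

```rust
/// Hypothetical trait shared by both access styles.
trait BitSource {
    fn next_u64(&mut self) -> u64;

    /// Default method built on `next_u64`: take the high 32 bits.
    fn next_u32(&mut self) -> u32 {
        (self.next_u64() >> 32) as u32
    }
}

/// Hypothetical stand-in for proof that the GIL is held.
struct GilToken;

/// Option 1: usable only while the token is borrowed (no locking).
struct GilBound<'gil> {
    _gil: &'gil GilToken,
    state: u64,
}

/// Option 2: a guard-like object that needs no token.
struct Detached {
    state: u64,
}

/// Placeholder generator step (xorshift64); the seed must be non-zero.
fn step(state: &mut u64) -> u64 {
    let mut x = *state;
    x ^= x << 13;
    x ^= x >> 7;
    x ^= x << 17;
    *state = x;
    x
}

impl BitSource for GilBound<'_> {
    fn next_u64(&mut self) -> u64 {
        step(&mut self.state)
    }
}

impl BitSource for Detached {
    fn next_u64(&mut self) -> u64 {
        step(&mut self.state)
    }
}

/// Generic code works with either option.
fn draw_pair<R: BitSource>(rng: &mut R) -> (u64, u32) {
    (rng.next_u64(), rng.next_u32())
}
```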

@flying-sheep flying-sheep changed the title BItGenerator support BitGenerator support Jun 6, 2025
@flying-sheep flying-sheep marked this pull request as ready for review June 8, 2025 12:44
Contributor

@Icxolu Icxolu left a comment

This looks like a useful addition! Thanks for working on it. I'm definitely not an expert here, but I left a few comments about things that stood out to me. Let me know what you think.
Also, are there any differences between numpy v1 and v2 that we need to consider?

Contributor

This should be removed

Author

sure, will do when I’m done. I like working on multiple machines, and I don’t like re-doing settings for individual projects

Contributor

@Icxolu Icxolu left a comment

I still don't love the drop impl, but with the way to manually release it with a Python token, it may be acceptable. Maybe @davidhewitt has an idea and/or comments about the approach. Otherwise I only have a few minor remarks.

@flying-sheep
Author

flying-sheep commented Jun 9, 2025

Thanks for the comments, I’ll address them!

The main issue is that I think I’m triggering UB somehow and I don’t know how: when running all tests, often some unrelated test that runs after this one crashes …

> Also, are there any differences between numpy v1 and v2 that we need to consider?

I didn’t forget about this either, will look!

/edit: the C API for random has been there since 1.19: https://numpy.org/doc/1.26/reference/random/c-api.html

//! # use pyo3::prelude::*;
//! use rand::Rng as _;
//! # use numpy::random::{PyBitGenerator, PyBitGeneratorMethods as _};
//! # // TODO: reuse function definition from above?
Member

It feels like there should be a convenient way to get this. I'm thinking about something like

impl PyBitGenerator {
     fn new(py: Python<'_>) -> PyResult<Bound<..>>;
}

Author

there are many implementations, we’d have to cover all of them.

I’d rather leave this minimal until this PR is mostly done.

.getattr(intern!(py, "capsule"))?
.downcast_into::<PyCapsule>()?;
let lock = self.getattr(intern!(py, "lock"))?;
// we’re holding the GIL, so there’s no race condition checking the lock and acquiring it later.
Contributor

This may not be true under free-threaded Python. Is the lock known to be thread-safe, and does acquire simply fail if the lock is already acquired? If not, we may need to guard the whole module under cfg(not(Py_GIL_DISABLED))

Author

it doesn’t fail, it hangs, but that’s configurable with a timeout or by making it non-blocking: https://docs.python.org/3/library/threading.html#threading.Lock.acquire

and it’s a threading.Lock!
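
For comparison, here is the same non-blocking idea in Rust's standard library. This is only an analogy and not the Python API: `std::sync::Mutex::try_lock` returns `Err(TryLockError::WouldBlock)` right away when the mutex is held, much as a non-blocking `acquire` reports failure instead of hanging. The function name `try_acquire_twice` is made up for this sketch.

```rust
use std::sync::{Mutex, TryLockError};

/// Returns (first attempt succeeded, second attempt reported "would block").
fn try_acquire_twice() -> (bool, bool) {
    let lock = Mutex::new(());

    // First non-blocking attempt: the mutex is free, so this succeeds.
    let first = lock.try_lock();
    let first_ok = first.is_ok();

    // Second attempt while the first guard is still alive:
    // it fails immediately instead of waiting.
    let second_would_block = matches!(lock.try_lock(), Err(TryLockError::WouldBlock));

    drop(first); // release the mutex
    (first_ok, second_would_block)
}
```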

@flying-sheep
Author

OK, with the release attr and changing the parallel test to use the explicit release as well, the UB now sometimes manifests as a lock poisoning error. progress?

@Icxolu
Contributor

Icxolu commented Jun 10, 2025

I may have found a problem:

This fails as intended:

Python::with_gil(|py| {
    let obj = get_bit_generator(py)?;
    let a = obj.lock()?;
    let b = obj.lock()?;

    Ok::<_, PyErr>(())
})
.unwrap();

returning

called `Result::unwrap()` on an `Err` value: PyErr { type: <class 'RuntimeError'>, value: RuntimeError('BitGenerator is already locked'), traceback: None }

But this does not fail:

Python::with_gil(|py| {
    let a = get_bit_generator(py)?.lock()?;
    let b = get_bit_generator(py)?.lock()?;

    Ok::<_, PyErr>(())
})
.unwrap();

and crucially it gives the same pointers:

[src/random.rs:113:18] ptr = 0x00007b9f6be44cc0
[src/random.rs:113:18] *ptr = bitgen_t {
    state: 0x00007b9f6be44d08,
    next_uint64: 0x00007b9f6837d320,
    next_uint32: 0x00007b9f6837d370,
    next_double: 0x00007b9f6837d3f0,
    next_raw: 0x00007b9f6837d320,
}
[src/random.rs:113:18] ptr = 0x00007b9f6be44cc0
[src/random.rs:113:18] *ptr = bitgen_t {
    state: 0x00007b9f6be44d08,
    next_uint64: 0x00007b9f6837d320,
    next_uint32: 0x00007b9f6837d370,
    next_double: 0x00007b9f6837d3f0,
    next_raw: 0x00007b9f6837d320,
}

So when using multiple threads, for example multiple tests running in parallel, we have a data race on the state. I think we need a lock across all instances to make this work.
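
One possible reading of "a lock across all instances" is a process-wide registry keyed by the address of the shared state, so that two wrapper objects pointing at the same state cannot both be locked. The sketch below is an assumption about how that might look, not code from the PR: `HELD`, `StateGuard`, and `lock_state` are hypothetical names, and plain `usize` values stand in for the state pointers shown in the debug output.

```rust
use std::collections::BTreeSet;
use std::sync::Mutex;

/// Hypothetical process-wide set of state addresses that are currently locked.
static HELD: Mutex<BTreeSet<usize>> = Mutex::new(BTreeSet::new());

/// Hypothetical guard: holding it means `addr` is registered in `HELD`.
struct StateGuard {
    addr: usize,
}

/// Register `addr`, failing if any other wrapper already holds the same state.
fn lock_state(addr: usize) -> Result<StateGuard, String> {
    let mut held = HELD.lock().unwrap();
    if held.insert(addr) {
        Ok(StateGuard { addr })
    } else {
        Err(format!("state at {addr:#x} is already locked"))
    }
}

impl Drop for StateGuard {
    /// Unregister the address so the state can be locked again.
    fn drop(&mut self) {
        HELD.lock().unwrap().remove(&self.addr);
    }
}
```

With this shape, the second of two locks on the same state address fails even if the two wrapper objects are distinct, which is the case the second snippet above slips through.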

Development

Successfully merging this pull request may close these issues.

rand::RngCore implementation for numpy.random.Generator
3 participants