Option for user-provided random.Random instance for parallel reproducibility #75

Open
wmayner opened this issue May 4, 2015 · 4 comments

@wmayner

wmayner commented May 4, 2015

I have a use case that may apply to others and warrant a new feature. Often, one wants to be able to reproduce the results of an evolution by supplying the same random seed. With the current implementation, this precludes running multiple instances of DEAP in parallel, since the randomness comes from the single shared random.Random instance that backs the module-level functions of the random module.

It would be great if there were an option to supply a user-created random.Random instance so multiple evolutions could be run in parallel without their RNGs sharing state, allowing for reproducibility.
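To illustrate the point about shared state, here is a minimal sketch using only the standard library. It assumes nothing about DEAP; the helper `draw` is a hypothetical name introduced for the example. It shows that two separately seeded random.Random instances keep their own streams even when their draws are interleaved.

```python
import random


def draw(rng, n=3):
    """Draw n floats from the given generator (hypothetical helper)."""
    return [rng.random() for _ in range(n)]


# Two independent generators, each with its own seed and its own state.
rng_a = random.Random(1)
rng_b = random.Random(2)

# Interleave the draws, as two evolutions sharing one interpreter might.
a_first = draw(rng_a)
b_all = draw(rng_b, 6)
a_rest = draw(rng_a)

# A fresh instance with the same seed reproduces the stream,
# no matter what the other generator did in between.
reference_a = draw(random.Random(1), 6)
assert a_first + a_rest == reference_a
```

Had both evolutions called the module-level functions instead, the draws for the second would have advanced the state seen by the first, and the result would depend on the interleaving.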

@cmd-ntrf
Member

cmd-ntrf commented May 5, 2015

We juggled with this sort of idea at the very beginning of DEAP. Maybe it is time to revisit it, but I have two questions first.

1- Why is an instance of random.Random required? Why is simply setting a different seed for each instance using random.seed insufficient?
2- How would the Random instance be provided? As a global variable? A parameter of each operator?

Complementary question: how do other frameworks (e.g. scikit-learn) handle this?

@cmd-ntrf cmd-ntrf added this to the 2.0 milestone May 5, 2015
@wmayner
Author

wmayner commented May 5, 2015

  1. I believe that just using random.seed is sufficient as long as there are totally separate instances of Python running. Orchestrating the parallelization from within a single instance of Python, however, would require seeding separate random.Random objects, since all direct calls to the random module use its singleton Random instance. This would be an enhancement to make it possible to run several evolutions in parallel from a single Python script with e.g. multiprocessing or joblib.
  2. The Random instance could be provided as an optional parameter to the functions that make random calls, or it could be registered (to a toolbox?) in some way.
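A rough sketch of what the two points above could look like in practice, using only the standard library. Everything here is an assumption made for illustration: `mut_flip_bit` and `evolve` are hypothetical names, not DEAP's API, and `functools.partial` only stands in for whatever registration mechanism would be chosen. Threads are used because they share one interpreter, and therefore one `random` module state.

```python
import random
from concurrent.futures import ThreadPoolExecutor
from functools import partial


def mut_flip_bit(individual, indpb, rng=random):
    """Hypothetical operator: flip each bit with probability indpb.

    rng is any object exposing random(); it defaults to the module-level
    generator, so callers that do not pass one behave as before.
    """
    return [(1 - bit) if rng.random() < indpb else bit for bit in individual]


def evolve(seed, generations=20, size=16):
    """Toy stand-in for one evolution that owns its generator."""
    rng = random.Random(seed)
    # Bind the per-run generator once, as a registered operator might be.
    mutate = partial(mut_flip_bit, indpb=0.2, rng=rng)
    individual = [0] * size
    for _ in range(generations):
        individual = mutate(individual)
    return individual


seeds = [1, 2, 3, 4]

# Run the evolutions concurrently inside a single interpreter.
with ThreadPoolExecutor(max_workers=4) as pool:
    parallel = list(pool.map(evolve, seeds))

# Each concurrent result matches a sequential run with the same seed.
sequential = [evolve(seed) for seed in seeds]
assert parallel == sequential
```

Because each run draws only from its own generator, the outcome does not depend on how the threads are scheduled. The same operator called without `rng` would fall back to the shared module-level state.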

It seems that SciPy's approach is the first option; there is an optional parameter that accepts an instance of numpy.random.RandomState (similar to random.Random) and defaults to the singleton numpy.random. See e.g. scipy.sparse.rand.

@Ogaday
Contributor

Ogaday commented Feb 23, 2016

For further encouragement to implement this, read the comments on this answer. Apparently Robert Kern is one of the NumPy contributors. I also think that being able to register a random number generator fits in with the toolbox approach of DEAP.

@fmder
Member

fmder commented Jul 4, 2019

This is linked to #349.
