The final section of this notebook shows a test example of poor convergence for mclmc without tempering. It compares posterior samples from 5 different runs:
- results from dynesty nested sampling (no initialization needed)
- results from NUTS/HMC as implemented in numpyro, initialized from the final state of the dynesty samples
- results from mclmc initialized from the final state of the dynesty samples, denoted mchmcd
- results from mclmc initialized from the final state of the numpyro samples, denoted mchmcn
- results from mclmc initialized with all parameters equal to 0 (essentially a random point), denoted mchmc0
There are a few things to note:
- I'm working in a constrained parameter space and I'd like to have uniform priors over fixed ranges. I found that NUTS/HMC was better behaved with "hard" priors, where the log-probability goes to -inf outside of a given range, but mclmc gave nans with this setup, so I gave it "smooth" priors instead. This leads to some slight disagreements for parameters that are prior dominated, as seen in the output of cell 35 in the notebook. (If I have time, I'd like to make sure that there are no prior-dominated parameters, but I haven't gotten to this yet. In the meantime, the hard vs. soft implementation of priors is why there are discrepancies on a few posteriors.)
- The main takeaway from that plot is that the mchmcd and mchmcn posteriors are very similar, whereas the mchmc0 posteriors (which are initialized from a bad point) are quite different and spend a lot of time far away from the other posteriors, even though I throw out half of the samples as burn-in.
- In the following cell you can see that the mchmcd and mchmcn chains explore rather similar log-probability values, differing by DLnP = 15, but the mchmc0 maximum log probability is lower by about 250 (or Dchi^2 = +500, if 2*DLnP is chi^2 distributed). The numpyro and dynesty results are even lower than that in LnP, but because of the different priors I don't think that's a fair comparison.
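To illustrate the hard vs. smooth prior distinction from the first note, here is a minimal sketch in plain Python. The function names, the sigmoid-edged form of the smooth prior, and the `width` parameter are all assumptions made for illustration. They are not necessarily what the notebook uses.

```python
import math


def log_sigmoid(z: float) -> float:
    """Numerically stable log(1 / (1 + exp(-z)))."""
    if z >= 0:
        return -math.log1p(math.exp(-z))
    return z - math.log1p(math.exp(z))


def log_prior_hard(x: float, lo: float, hi: float) -> float:
    """Uniform prior on [lo, hi]: constant inside, -inf outside."""
    if lo <= x <= hi:
        return -math.log(hi - lo)
    return -math.inf


def log_prior_smooth(x: float, lo: float, hi: float, width: float) -> float:
    """Unnormalized soft-edged box (hypothetical form).

    Each edge is a sigmoid of scale `width`, so the value is close to 0
    well inside [lo, hi] and decreases smoothly, but stays finite, outside.
    """
    return log_sigmoid((x - lo) / width) + log_sigmoid((hi - x) / width)


if __name__ == "__main__":
    # Compare the two priors inside and outside the range [0, 1].
    for x in (-0.5, 0.5, 1.5):
        print(x, log_prior_hard(x, 0.0, 1.0), log_prior_smooth(x, 0.0, 1.0, 0.05))
```

The smooth version is finite everywhere, at the cost of leaking a little probability mass outside [lo, hi] and slightly down-weighting points near the edges. That is the kind of effect that could show up as small shifts in prior-dominated parameters.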
This doesn't seem to be a bug to me, but @reubenharry suggested I submit an issue since this demonstrates a difference in performance between tempered and non-tempered results. I'm happy to run the mchmc0 chains with different specifications and different approaches to annealing/tempering if that's useful. Just let me know.
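As a starting point for discussion, here is a minimal sketch of one tempering setup that could be tried for the mchmc0 runs. Everything in it is an assumption: tempering only the likelihood term, the geometric schedule, and the names `beta_ladder` and `tempered_log_density` are illustrative and not tied to any sampler's API.

```python
from typing import Callable, List


def beta_ladder(n_stages: int, beta_min: float) -> List[float]:
    """Geometric ladder of inverse temperatures from beta_min up to 1."""
    if n_stages < 2:
        return [1.0]
    return [beta_min ** (1.0 - i / (n_stages - 1)) for i in range(n_stages)]


def tempered_log_density(
    log_like: Callable[[float], float],
    log_prior: Callable[[float], float],
    beta: float,
) -> Callable[[float], float]:
    """Return x -> beta * log_like(x) + log_prior(x).

    beta = 0 gives the prior alone; beta = 1 gives the full posterior
    (up to a normalizing constant).
    """
    def logp(x: float) -> float:
        return beta * log_like(x) + log_prior(x)

    return logp


if __name__ == "__main__":
    # Toy 1-D target: a narrow likelihood peak far from the starting point x = 0.
    def log_like(x: float) -> float:
        return -0.5 * ((x - 3.0) / 0.1) ** 2

    def log_prior(x: float) -> float:
        return -0.5 * x ** 2

    for beta in beta_ladder(5, 1e-4):
        logp = tempered_log_density(log_like, log_prior, beta)
        print(f"beta={beta:.4g}  logp(0)={logp(0.0):.4g}")
```

The idea would be to run a short chain at each beta in turn, starting each stage from the end of the previous one, so that a chain started from a bad point sees a flattened landscape first and the full posterior only at the last stage.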