Support for complex numbers #262
Thanks for the example. It would not be difficult to support it programmatically, but what I don't quite understand is how HMC would work on the complex domain, which I need to look into a bit. On the practical side, does this (or any other) inference problem with complex numbers have a closed-form solution, so that we can also check whether the inference is correct?
Unfortunately, I am not very familiar with these models yet, so I don't know if there are any models with a closed-form solution. I will ask my colleague next week to see if he has any ideas. I wonder if it would be possible to modify a simple Binomial model so that it is in a complex domain.
The reason the above model does not work has nothing to do with AdvancedHMC.jl. AdvancedHMC does indeed require that all parameters be real, but your only parameter, θ, already is real.

The issue here is two-fold. First, ForwardDiff has only partial support for complex numbers, so while it might work for some models, expect it to fail for others. In this case, the issue is that your model includes the matrix exponential, whose signature is constrained to specific element types, so it fails on ForwardDiff's `Dual` numbers:

```
nested task error: MethodError: no method matching exp(::Matrix{Complex{ForwardDiff.Dual{ForwardDiff.Tag{Turing.Core.var"#f#1"{DynamicPPL.TypedVarInfo{NamedTuple{(:θ,), Tuple{DynamicPPL.Metadata{Dict{AbstractPPL.VarName{:θ, Tuple{}}, Int64}, Vector{Truncated{Normal{Float64}, Continuous, Float64}}, Vector{AbstractPPL.VarName{:θ, Tuple{}}}, Vector{Float64}, Vector{Set{DynamicPPL.Selector}}}}}, Float64}, DynamicPPL.Model{var"#1#2", (:data, :S, :n_sim), (), (), Tuple{Tuple{Vector{Int64}, Vector{Int64}}, Vector{Float64}, Int64}, Tuple{}}, DynamicPPL.Sampler{NUTS{Turing.Core.ForwardDiffAD{40}, (), AdvancedHMC.DiagEuclideanMetric}}, DynamicPPL.DefaultContext}, Float64}, Float64, 1}}})
Closest candidates are:
  exp(::StridedMatrix{var"#s832"} where var"#s832"<:Union{Float32, Float64, ComplexF32, ComplexF64}) at /Users/julia/buildbot/worker/package_macos64/build/usr/share/julia/stdlib/v1.6/LinearAlgebra/src/dense.jl:557
  exp(::StridedMatrix{var"#s832"} where var"#s832"<:Union{Integer, Complex{var"#s831"} where var"#s831"<:Integer}) at /Users/julia/buildbot/worker/package_macos64/build/usr/share/julia/stdlib/v1.6/LinearAlgebra/src/dense.jl:558
  exp(::StaticArrays.StaticMatrix{N, M, T} where {N, M, T}) at /Users/saxen/.julia/packages/StaticArrays/rdb0l/src/expm.jl:1
```

So any AD that uses type overloading will fail unless a special rule has been added for `exp`. I recommend using Zygote: add `Turing.setadbackend(:zygote)` to your example. But one thing to keep in mind is that Zygote does not support code that mutates array values, so you would need to modify your implementation to avoid mutation.

On the initial issue of supporting HMC with complex parameters, I don't think this is a good fit for this package. Most models will have a mixture of real and complex parameters, which could not fit well in an array of uniform eltype. Then there's the question of how to handle the metric, whose dimension would not match the number of degrees of freedom. No distributions in Distributions.jl support complex random variables, and the complex distributions I know of can all be trivially constructed in terms of a real distribution. It's straightforward to define complex parameters in terms of real ones, so unless we have a good example where this just won't work, I recommend this package keep to sampling real-valued parameters.
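To illustrate that last point, here is a hypothetical sketch of defining a complex quantity in terms of two real parameters inside a Turing model. The model name and the commented-out likelihood are placeholders, not anything from this thread:

```julia
using Turing

# Hypothetical sketch: HMC only ever sees the two real parameters
# `x_re` and `x_im`; the complex value `z` is a deterministic
# function of them, used downstream in the likelihood.
@model function complex_reparam(data)
    x_re ~ Normal(0, 1)
    x_im ~ Normal(0, 1)
    z = complex(x_re, x_im)
    # data ~ SomeComplexLikelihood(z)   # placeholder likelihood on `z`
end
```

This keeps the sampler's parameter vector purely real, sidestepping both the uniform-eltype problem and the metric-dimension problem mentioned above.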
Thanks for the input here @sethaxen.
I agree with you on these issues, which I didn't realise earlier.
Agreed. It seems a bit too involved to do this properly. Also, as you suggested, the original problem from @itsdfish is not really sampling on the complex domain.
Out of the box I would not expect it to work. Since the imaginary components of the real parameters can vary without any change to the log probability, they make the target distribution improper. But e.g. a real diagonal metric would have the real and imaginary parts sharing the same diagonal entry of the metric, so one would be adapting that entry of the metric to two parameters: one of infinite scale and one of finite scale, which would not go well. To get any reasonable performance, one would have to either indicate which imaginary parts should be zero and constrain them to be zero, or augment the logpdf to place a prior on the imaginary parts, which still would not solve the shared-scale issue.
I guess you could reparameterise theta to get rid of this periodicity.
You can try https://fluxml.ai/Zygote.jl/latest/utils/#Zygote.Buffer to do mutation in Zygote.
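A minimal sketch of the `Zygote.Buffer` pattern mentioned above. The function is purely illustrative (not from this thread); the idea is to write into a `Buffer` instead of mutating a plain `Array`, then freeze it with `copy`:

```julia
using Zygote

# Accumulate into a Zygote.Buffer instead of mutating a plain Array
# (which Zygote cannot differentiate through), then freeze it with `copy`.
function cumsq(x)
    buf = Zygote.Buffer(x)          # writable, AD-friendly scratch array
    acc = zero(eltype(x))
    for i in eachindex(x)
        acc += x[i]^2
        buf[i] = acc                # mutation is fine on a Buffer
    end
    return copy(buf)                # convert back to an ordinary array
end

# Differentiate a function that internally "mutates":
Zygote.gradient(x -> sum(cumsq(x)), [1.0, 2.0, 3.0])
# for this input the gradient is ([6.0, 8.0, 6.0],)
```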
Thank you both for your help. I'll go ahead and close this issue.
No problem!
That's right! Thankfully, as @xukai92 pointed out, this can be fixed with a simple reparameterization. Something like this:

```julia
@model model(data, S, n_sim) = begin
    x ~ Normal(0, 1)
    y ~ Normal(0, 1)
    θ = atan(y, x)
    data ~ Quantum(θ, S, n_sim)
end
```

would create a uniform prior on `θ` over `(-π, π]`. To place a different prior on `θ`, e.g. a von Mises distribution, you can add its log-density explicitly:

```julia
@model model(data, S, n_sim) = begin
    x ~ Normal(0, 1)
    y ~ Normal(0, 1)
    θ = atan(y, x)
    Turing.@addlogprob! logpdf(VonMises(2, 0.25), θ)
    data ~ Quantum(θ, S, n_sim)
end
```

This works for reasons outlined in the Stan manual. You just need to be careful that the distribution you apply to `θ` covers the range of `atan(y, x)`, which is `(-π, π]`.
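The claim that `atan(y, x)` of two independent standard normals is uniform can be checked empirically with a base-Julia sketch, independent of Turing. The joint density of `(x, y)` is rotationally symmetric, so the angle carries equal mass in every arc:

```julia
using Random, Statistics

Random.seed!(42)
# θ = atan(y, x) for independent standard normals should be uniform
# on (-π, π], so each half-circle should receive ~50% of the samples.
θs = [atan(randn(), randn()) for _ in 1:100_000]
println(mean(θs .> 0))           # ≈ 0.5 (positive half-circle)
println(mean(abs.(θs) .< π/2))   # ≈ 0.5 (right half-circle)
```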
As I understand, there are, but not soon. This is apparently a hard problem that needs an engineer's focus to solve, and I don't completely understand it. For reference, some other AD systems like JAX have the same limitations. Other reverse-mode ADs in Julia like ReverseDiff and Tracker don't, but they also don't support complex numbers well or at all, I think.
@sethaxen, this is really helpful. I didn't even think about that. Understandable about Zygote. From what I can tell, developing AD software is very challenging. It is certainly a stress test for the language. I hope that one day one of the packages can approach the performance of Stan without many sacrifices to flexibility. I have one minor question: why does this work?

Edit: is it because theta is deterministic and dependent on x and y?
I haven't seen a direct comparison, so I actually don't know how Stan's AD compares to Julia's various AD packages.
More or less.

```julia
mylogpdf(d::VonMises, x) = d.κ * (cos(x - d.μ) - 1) - log(d.I0κx) - log2π

@model model(data, S, n_sim) = begin
    x ~ Normal(0, 1)
    y ~ Normal(0, 1)
    θ = atan(y, x)
    Turing.@addlogprob! mylogpdf(VonMises(2, 0.25), θ)
    data ~ Quantum(θ, S, n_sim)
end
```

I strongly encourage you to check that this works correctly if you use it, e.g. by dropping the likelihood and drawing random samples from the prior, then drawing random samples from the `VonMises` directly and comparing the two.
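One quick sanity check on `mylogpdf` itself: it should agree with Distributions.jl's built-in `logpdf` inside the support, since `VonMises` stores the exponentially scaled Bessel factor in its `I0κx` field (as `mylogpdf` above assumes). A sketch, written with `log(2π)` so only Distributions is needed:

```julia
using Distributions

# Same formula as `mylogpdf` above; `d.I0κx` is the scaled Bessel
# factor I₀(κ)·exp(-κ) that Distributions.jl's VonMises stores.
my_vm_logpdf(d::VonMises, x) = d.κ * (cos(x - d.μ) - 1) - log(d.I0κx) - log(2π)

d = VonMises(2, 0.25)
x = 2.3                                  # inside the support (μ - π, μ + π)
@show my_vm_logpdf(d, x) logpdf(d, x)    # the two should agree
```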
Thank you @sethaxen. I will be sure to validate your recommendations. Regarding AD performance, I have done some comparisons.
Hi @xukai92,
As discussed on Slack, adding support for complex numbers would allow us to use many models from physics. The code below is a quantum model of human judgment based on Wang et al. (2014). The model has a single parameter theta, which rotates the basis vectors. Thank you for looking into this. This feature would be very useful for me and others as well.