
[WIP] add SGHMC, SGLD trajectories #113


Closed · sivapvarma wants to merge 4 commits

Conversation


@sivapvarma commented Oct 13, 2019

These are just placeholders for now; I am still getting a feel for how things are organized in AHMC, but I wanted to get started at the same time.

The goal is to fix Issue #60.

Comments are welcome.

h::Hamiltonian,
z::PhasePoint
) where {T<:Real}
z′ = step(rng, τ.integrator, h, z, τ.n_steps)
We probably only need to change this line to implement SGHMC.
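
For concreteness, a minimal sketch of what that replacement might look like, assuming the discrete SGHMC update of Chen et al. (2014). sghmc_step, η (step size), and α (friction/momentum decay) are hypothetical names; only PhasePoint, phasepoint, and the h.∂ℓπ∂θ field come from AHMC:

# Hypothetical sketch, not AHMC's actual API: a naive SGHMC transition
# replacing the leapfrog call above.
using AdvancedHMC: Hamiltonian, PhasePoint, phasepoint
using Random: AbstractRNG

function sghmc_step(rng::AbstractRNG, h::Hamiltonian, z::PhasePoint, n_steps::Int;
                    η::Real = 1e-2, α::Real = 0.1)
    θ, v = z.θ, z.r
    for _ in 1:n_steps
        θ = θ .+ v
        _, ∇ℓπ = h.∂ℓπ∂θ(θ)  # value-and-gradient tuple
        # friction-damped momentum update plus injected noise N(0, 2αη)
        v = (1 - α) .* v .+ η .* ∇ℓπ .+ sqrt(2 * α * η) .* randn(rng, length(θ))
    end
    return phasepoint(h, θ, v)  # re-wrap as a PhasePoint
end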

h::Hamiltonian,
z::PhasePoint
) where {T<:Real}
z′ = step(rng, τ.integrator, h, z, τ.n_steps)

We probably only need to change this line to implement SGLD.
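
Similarly, a minimal sketch of the SGLD variant, assuming the update rule of Welling & Teh (2011); sgld_step and ϵ (step size) are hypothetical names:

# Hypothetical sketch, not AHMC's actual API: SGLD has no momentum, so each
# step is a half-gradient move plus Gaussian noise of variance ϵ.
using AdvancedHMC: Hamiltonian
using Random: AbstractRNG

function sgld_step(rng::AbstractRNG, h::Hamiltonian, θ::AbstractVector, n_steps::Int;
                   ϵ::Real = 1e-3)
    for _ in 1:n_steps
        _, ∇ℓπ = h.∂ℓπ∂θ(θ)  # value-and-gradient tuple
        θ = θ .+ (ϵ / 2) .* ∇ℓπ .+ sqrt(ϵ) .* randn(rng, length(θ))
    end
    return θ
end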


@xukai92 (Member) commented Oct 16, 2019

I left some comments on how to proceed.


@xukai92 (Member) commented Apr 9, 2020

The commit history is messed up. Please rebase and force-push.

ToDo: computing stochastic gradients.

@sivapvarma (Author) left a comment


@xukai92 I have added the updates for SGHMC.

I am still stuck on how we compute the stochastic gradients. My main source of confusion is that there are many AD frameworks supported by Turing (ForwardDiff, ReverseDiff, Zygote, Tracker). I know Zygote is the way forward, but I keep getting lost in how all of them are handled by Turing. Any pointers to AD documentation for Turing would help.

Furthermore, we need to compute gradients on minibatches, so there are still more details of the interface design to work out.


@xukai92 (Member) commented Apr 13, 2020

> I am still stuck on how we compute the stochastic gradients. My main source of confusion is that there are many AD frameworks supported by Turing (ForwardDiff, ReverseDiff, Zygote, Tracker). I know Zygote is the way forward, but I keep getting lost in how all of them are handled by Turing. Any pointers to AD documentation for Turing would help.

I think the design here is Turing/AD-agnostic. All we need to assume is that Hamiltonian.∂ℓπ∂θ returns a tuple of value and gradient.
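
For example, a minimal sketch of that contract; ForwardDiff is just one possible backend, and ℓπ here is an arbitrary toy log-density:

# The only contract AHMC relies on: ∂ℓπ∂θ returns (log-density, gradient).
using ForwardDiff

ℓπ(θ) = -0.5 * sum(abs2, θ)                      # standard-normal log-density, up to a constant
∂ℓπ∂θ(θ) = (ℓπ(θ), ForwardDiff.gradient(ℓπ, θ))  # the (value, gradient) tuple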

> Furthermore, we need to compute gradients on minibatches, so there are still more details of the interface design to work out.

I guess there are two options:

  1. Assume the gradient is scaled correctly by the user when implementing ∂ℓπ∂θ, or
  2. Include the batch size (M) and the whole-dataset size (N) in SGLD/SGHMC and scale the gradient inside AHMC (a rough sketch follows below).
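
A rough sketch of option 2, with hypothetical names (SGLDTrajectory and scaled_gradient are illustrative, not AHMC's actual types): the trajectory carries M and N, and AHMC rescales the minibatch gradient by N/M so it is an unbiased estimate of the full-data gradient:

# Hypothetical sketch of option 2.
struct SGLDTrajectory{I}
    integrator::I
    n_steps::Int
    M::Int  # minibatch size
    N::Int  # whole-dataset size
end

function scaled_gradient(τ::SGLDTrajectory, h, θ)
    ℓ, ∇ℓ = h.∂ℓπ∂θ(θ)           # log-density and gradient on the current minibatch
    return ℓ, (τ.N / τ.M) .* ∇ℓ  # rescale to an unbiased full-data estimate
end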

@yebai closed this Apr 21, 2020