Add support for inhomogeneous parameters in LinearGaussianConjugateSSM.fit_blocked_gibbs #403


Open · wants to merge 2 commits into main from feature/inhomogeneous-lgcssm-gibbs
Conversation

@hylkedonker commented Mar 2, 2025

This PR aims to address #402. In brief, it adds support for Gibbs sampling with inhomogeneous (i.e., time-varying) parameters in the linear Gaussian conjugate state space model LinearGaussianConjugateSSM.
The primary code change in LinearGaussianConjugateSSM.fit_blocked_gibbs is as follows:

  • Compute the summary statistics per time point $t$.
  • Sample the parameters:
    • Inhomogeneous model: sample a separate $[F_t, B_t, b_t]^T$ and $[H_t, D_t, d_t]^T$ from a multivariate normal inverse Wishart (MNIW) posterior for each time point $t$.
    • Homogeneous (steady-state) model: aggregate the summary statistics from all time points, then sample $[F, B, b]^T$ and $[H, D, d]^T$ from the MNIW posterior.
  • The PR also includes a (fairly trivial) unit test that checks the sampled shapes. (A more comprehensive Markov chain Monte Carlo test would involve simulation-based calibration, but that takes too long to include in a test suite.)
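In pseudocode, the parameter step above amounts to the following. This is a minimal sketch with hypothetical names, not the actual dynamax code; `sample_mniw` stands in for whatever routine draws parameters from the MNIW posterior implied by a tuple of summary statistics:

```python
def gibbs_param_step(stats_per_t, sample_mniw, inhomogeneous=True):
    """Sketch of the parameter block of the Gibbs sweep.

    stats_per_t: length-T list of summary-statistic tuples, one per
    time point. sample_mniw: draws parameters from the MNIW posterior
    implied by one statistic tuple. (Hypothetical names, not dynamax.)
    """
    if inhomogeneous:
        # One independent MNIW draw per time point t.
        return [sample_mniw(stats) for stats in stats_per_t]
    # Homogeneous: aggregate the statistics over time, then draw once.
    # (Glossed over here: the prior term should enter the sum only once.)
    pooled = tuple(sum(parts) for parts in zip(*stats_per_t))
    return [sample_mniw(pooled)]
```

The same skeleton covers both branches of the PR: the inhomogeneous case returns $T$ parameter draws, the homogeneous case a single one.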

Scope:

  • Currently uses the same multivariate normal inverse Wishart prior for all time points.
  • Treats either all dynamics and emission weights and biases as time-dependent, or all as time-independent. For instance, this PR doesn't yet support time-independent emissions with time-dependent dynamics (or vice versa).
  • Algorithms other than fit_blocked_gibbs (e.g., EM) are out of scope for this PR.

Let me know if this looks good, or if you require any modifications.

@hylkedonker force-pushed the feature/inhomogeneous-lgcssm-gibbs branch from 5202215 to 00f6ee1 on March 24, 2025 at 14:26
@slinderman (Collaborator)

@hylkedonker, maybe I'm a bit confused about the proposal. If each time step has its own parameters and there's no tying of parameters in the prior (e.g., via a hierarchical model, a discrete switching model, or a slowly varying model), how do you estimate the per-timestep parameters? If you only get one observation of $(z_t, y_t)$, then it seems you don't have enough information to infer the emissions and dynamics parameters. Could you explain in more detail how this works?

@hylkedonker (Author) commented Mar 26, 2025

I had in mind that there is not just one sequence, but rather a dataset of $i=1,\dots,m$ training examples (or replicates), each with inputs $u^{(i)}_t$ for time points $t=1,\dots,T$. Thus, each dynamics and emission parameter can learn from $m$ training examples.

Concretely, for this pull request the generative model is as follows.
For the time-varying dynamic parameters, take the prior:

$$Q_t, W_{zt} \sim \mathcal{MN}\mathcal{W}^{-1} (M_{zt0}, V_{zt0}, \nu_{qt0}, \psi_{qt0})$$

for $t = 2\dots T$ with $W_{zt} \equiv [F_t, B_t, b_t]$, while for the emission parameters let

$$R_t, W_{yt} \sim \mathcal{MN}\mathcal{W}^{-1} (M_{yt0}, V_{yt0}, \nu_{rt0}, \psi_{rt0})$$

running from $t = 1\dots T$ and $W_{yt} = [H_t, D_t, d_t]$.
(In this pull request, the prior hyperparameters $M_{z/yt0}$, $V_{z/yt0}$, $\nu_{q/rt0}$ and $\psi_{q/rt0}$ are the same for all values of $t$, but this can easily be generalized.) Further, for each training example $i=1\dots m$

$$\begin{aligned} z^{(i)}_t \sim \mathcal{N}(F_t z^{(i)}_{t-1} + B_t u^{(i)}_t + b_t, Q_t) \equiv \mathcal{N}(W_{zt} x^{(i)}_{zt}, Q_t)\\\ y^{(i)}_t \sim \mathcal{N}(H_t z^{(i)}_{t} + D_t u^{(i)}_t + d_t, R_t) \equiv \mathcal{N}(W_{yt} x^{(i)}_{yt}, R_t) \end{aligned}$$

where we defined $x^{(i)}_{zt} = [z^{(i)}_{t-1}, u^{(i)}_t, 1]^\intercal$ and $x^{(i)}_{yt} = [z^{(i)}_{t}, u^{(i)}_t, 1]^\intercal$. Pooling information from the $i=1\dots m$ training examples

$$\begin{aligned} S_{zt}^{xx} = V_{zt0} + \sum_{i=1}^{m} x^{(i)}_{zt} {x^{(i)}_{zt}}^\intercal\\\ S_{zt}^{yy} = M_{zt0} V_{zt0} M_{zt0}^\intercal + \sum_{i=1}^{m} z^{(i)}_{t} {z^{(i)}_{t}}^\intercal\\\ S_{zt}^{yx} = M_{zt0} V_{zt0} + \sum_{i=1}^{m} z^{(i)}_{t} {x^{(i)}_{zt}}^\intercal \\\ S_{zt}^{y|x} = S_{zt}^{yy} - S_{zt}^{yx} (S_{zt}^{xx})^{-1} {S_{zt}^{yx}}^\intercal \end{aligned}$$
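These statistics can be computed directly; here is a sketch with numpy in place of JAX and hypothetical names (`Z_prev`, `U`, `Z_next` stack the $m$ replicates at a single time point $t$):

```python
import numpy as np

def dynamics_suff_stats(Z_prev, U, Z_next, M0, V0):
    """Pooled sufficient statistics for one time point t (sketch).

    Z_prev, Z_next: (m, dim_z) arrays of z_{t-1} and z_t replicates;
    U: (m, dim_u) inputs; M0, V0: MNIW prior mean and column precision.
    (Illustrative names, not the dynamax internals.)
    """
    m = Z_prev.shape[0]
    # x_{zt}^{(i)} = [z_{t-1}^{(i)}, u_t^{(i)}, 1]
    X = np.hstack([Z_prev, U, np.ones((m, 1))])
    Sxx = V0 + X.T @ X                           # S^{xx}_{zt}
    Syy = M0 @ V0 @ M0.T + Z_next.T @ Z_next     # S^{yy}_{zt}
    Syx = M0 @ V0 + Z_next.T @ X                 # S^{yx}_{zt}
    # Schur complement: S^{y|x} = S^{yy} - S^{yx} (S^{xx})^{-1} S^{yx}^T
    Sy_x = Syy - Syx @ np.linalg.solve(Sxx, Syx.T)
    return Sxx, Syy, Syx, Sy_x
```

Note that $S^{y|x}_{zt}$ is a Schur complement, hence symmetric whenever $S^{xx}_{zt}$ and $S^{yy}_{zt}$ are.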

Then the posterior

$$Q_t, [F_t, B_t, b_t] \mid (z^{(1)}_{t-1}, u^{(1)}_{t}, z^{(1)}_{t})\dots (z^{(m)}_{t-1}, u^{(m)}_{t}, z^{(m)}_{t}) \sim \mathcal{MN}\mathcal{W}^{-1} (M_{ztm}, V_{ztm}, \nu_{qtm}, \psi_{qtm})$$

where

$$\begin{aligned} M_{ztm} = S_{zt}^{yx} (S_{zt}^{xx})^{-1}\\\ V_{ztm} = S_{zt}^{xx} \\\ \nu_{qtm} = \nu_{qt0} + m\\\ \psi_{qtm} = \psi_{qt0} + S_{zt}^{y|x} \end{aligned}$$
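Given these posterior parameters, a single MNIW draw can be sketched with scipy's `invwishart` and `matrix_normal` distributions (illustrative only; the variable names mirror the equations above and are not the dynamax API):

```python
import numpy as np
from scipy.stats import invwishart, matrix_normal

def sample_mniw_posterior(Sxx, Sy_x, Syx, nu0, Psi0, m, rng=None):
    """Draw (Q_t, W_t) from the MNIW posterior (sketch, not dynamax)."""
    Mn = Syx @ np.linalg.inv(Sxx)    # M_{ztm} = S^{yx} (S^{xx})^{-1}
    nun = nu0 + m                    # nu_{qtm}
    Psin = Psi0 + Sy_x               # psi_{qtm}
    # First sample the noise covariance Q_t ~ IW(nu_n, Psi_n) ...
    Q = np.atleast_2d(invwishart.rvs(df=nun, scale=Psin, random_state=rng))
    # ... then W_t | Q_t ~ MN(M_n, Q_t, (S^{xx})^{-1}):
    # row covariance Q_t, column covariance V_{ztm}^{-1} = (S^{xx})^{-1}.
    W = matrix_normal.rvs(mean=Mn, rowcov=Q, colcov=np.linalg.inv(Sxx),
                          random_state=rng)
    return Q, W
```

The inhomogeneous model simply calls this once per time point $t$ with that time point's statistics.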

In the special case of just one sequence (i.e., $m=1$), there is indeed very little to learn, as you remarked.
The posterior of

$$R_t, [H_t, D_t, d_t] \mid (u^{(1)}_t, z^{(1)}_t, y^{(1)}_t)\dots (u^{(m)}_t, z^{(m)}_t, y^{(m)}_t) \sim \mathcal{MN}\mathcal{W}^{-1} (M_{ytm}, V_{ytm}, \nu_{rtm}, \psi_{rtm})$$

is given in a similar way upon substituting $x^{(i)}_{zt} \rightarrow x^{(i)}_{yt}$, $W_{zt} \rightarrow W_{yt}$, and $z^{(i)}_t \rightarrow y_t^{(i)}$.

When the model is stationary (homogeneous, time-independent), the only change is that we drop the time index and extend the sum over both training examples and time points. For example,

$$S_{z}^{xx} = V_{z0} + \sum_{t=2}^T \sum_{i=1}^{m} x^{(i)}_{zt} {x^{(i)}_{zt}}^\intercal$$
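One subtlety when pooling: each per-time statistic $S_{zt}^{xx}$ already contains the prior term $V_{zt0}$, which should enter the pooled statistic only once. A small sketch of the correction (hypothetical names):

```python
import numpy as np

def pool_sxx_over_time(sxx_per_t, V0):
    """Pool per-time S^{xx}_t statistics into a homogeneous S^{xx}.

    Each S^{xx}_t already includes the prior term V0, so strip the
    duplicates before summing and add V0 back once. (Sketch only.)
    """
    data_only = [S - V0 for S in sxx_per_t]
    return V0 + sum(data_only)
```

The same correction applies to the other statistics that carry a prior term ($S^{yy}$ and $S^{yx}$).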

I hope this clarifies the proposal.

Other thoughts:

  • If you're interested in moving forward with this PR, then we might also consider adapting the following functions for uniformity of the interface:
    • log_prior
    • m_step
    • posterior_predictive
    • {transition,emission}_distribution
  • As an alternative to extending the current LinearGaussianConjugateSSM class, we could also define a new class specifically for inhomogeneous LG(C)SSMs, leaving the existing LinearGaussianConjugateSSM model as is (for homogeneous models).
  • It looks like I need to resolve some conflicts. :-)

I look forward to hearing your thoughts.

@hylkedonker force-pushed the feature/inhomogeneous-lgcssm-gibbs branch from 00f6ee1 to e4f556a on March 27, 2025 at 20:30