Add support for inhomogeneous parameters in LinearGaussianConjugateSSM.fit_blocked_gibbs #403


Open · wants to merge 2 commits into main from feature/inhomogeneous-lgcssm-gibbs
Conversation

@hylkedonker commented Mar 2, 2025

This PR aims to address #402. In brief, it adds support for Gibbs sampling with inhomogeneous (i.e., time-varying) parameters in the linear Gaussian conjugate state space model LinearGaussianConjugateSSM.
The primary code change in LinearGaussianConjugateSSM.fit_blocked_gibbs is as follows:

  • Compute the summary statistics per time point $t$.
  • Sample the parameters:
    • Inhomogeneous model: sample a separate $[F_t, B_t, b_t]^T$ and $[H_t, D_t, d_t]^T$ from a multivariate normal inverse Wishart (MNIW) posterior for each time point $t$.
    • Homogeneous (steady-state) model: aggregate the summary statistics from all time points, then sample $[F, B, b]^T$ and $[H, D, d]^T$ from the MNIW posterior.
  • The PR also includes a (fairly trivial) unit test that checks the sampled shapes. (A more comprehensive Markov chain Monte Carlo test would involve simulation-based calibration, but that takes too long to include in a test suite.)
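In pseudocode, the parameter step above amounts to the following. This is a minimal sketch with hypothetical names, not the actual dynamax code; `sample_mniw` stands in for whatever routine draws parameters from the MNIW posterior implied by a tuple of summary statistics:

```python
def gibbs_param_step(stats_per_t, sample_mniw, inhomogeneous=True):
    """Sketch of the parameter block of the Gibbs sweep.

    stats_per_t: length-T list of summary-statistic tuples, one per
    time point. sample_mniw: draws parameters from the MNIW posterior
    implied by one statistic tuple. (Hypothetical names, not dynamax.)
    """
    if inhomogeneous:
        # One independent MNIW draw per time point t.
        return [sample_mniw(stats) for stats in stats_per_t]
    # Homogeneous: aggregate the statistics over time, then draw once.
    # (Glossed over here: the prior term should enter the sum only once.)
    pooled = tuple(sum(parts) for parts in zip(*stats_per_t))
    return [sample_mniw(pooled)]
```

The same skeleton covers both branches of the PR: the inhomogeneous case returns $T$ parameter draws, the homogeneous case a single one.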

Scope:

  • Currently uses the same multivariate normal inverse Wishart prior for all time points.
  • Treats either all dynamics and emission weights and biases as time-dependent, or all as time-independent. For instance, this PR doesn't yet support time-independent emissions with time-dependent dynamics (or vice versa).
  • Algorithms other than fit_blocked_gibbs (e.g., EM) are out of scope for this PR.

Let me know if this looks good, or if you require any modifications.

@hylkedonker force-pushed the feature/inhomogeneous-lgcssm-gibbs branch from 5202215 to 00f6ee1 on March 24, 2025 at 14:26
@slinderman (Collaborator)

@hylkedonker, maybe I'm a bit confused about the proposal. If each time step has its own parameters and there's no tying of parameters in the prior (e.g., via a hierarchical model, a discrete switching model, or a slowly varying model), how do you estimate the per-timestep parameters? If you only get one observation of $(z_t, y_t)$, then it seems you don't have enough information to infer the emissions and dynamics parameters. Could you explain in more detail how this works?

@hylkedonker (Author) commented Mar 26, 2025

I had in mind that there is not just one sequence, but rather a dataset of $i=1,\dots,m$ training examples (or replicates), each with inputs $u^{(i)}_t$ for time points $t=1,\dots,T$. Thus, each dynamics and emission parameter can learn from $m$ training examples.

Concretely, for this pull request the generative model is as follows.
For the time-varying dynamic parameters, take the prior:

$$Q_t, W_{zt} \sim \mathcal{MN}\mathcal{W}^{-1} (M_{zt0}, V_{zt0}, \nu_{qt0}, \psi_{qt0})$$

for $t = 2\dots T$ with $W_{zt} \equiv [F_t, B_t, b_t]$, while for the emission parameters let

$$R_t, W_{yt} \sim \mathcal{MN}\mathcal{W}^{-1} (M_{yt0}, V_{yt0}, \nu_{rt0}, \psi_{rt0})$$

running from $t = 1\dots T$ and $W_{yt} = [H_t, D_t, d_t]$.
(In this pull request, the prior hyperparameters $M_{z/yt0}$, $V_{z/yt0}$, $\nu_{q/rt0}$ and $\psi_{q/rt0}$ are the same for all values of $t$, but this can easily be generalized.) Further, for each training example $i=1\dots m$

$$\begin{aligned} z^{(i)}_t \sim \mathcal{N}(F_t z^{(i)}_{t-1} + B_t u^{(i)}_t + b_t, Q_t) \equiv \mathcal{N}(W_{zt} x^{(i)}_{zt}, Q_t)\\\ y^{(i)}_t \sim \mathcal{N}(H_t z^{(i)}_{t} + D_t u^{(i)}_t + d_t, R_t) \equiv \mathcal{N}(W_{yt} x^{(i)}_{yt}, R_t) \end{aligned}$$

where we defined $x^{(i)}_{zt} = [z^{(i)}_{t-1}, u^{(i)}_t, 1]^\intercal$ and $x^{(i)}_{yt} = [z^{(i)}_{t}, u^{(i)}_t, 1]^\intercal$. Pooling information from the $i=1\dots m$ training examples

$$\begin{aligned} S_{zt}^{xx} = V_{zt0} + \sum_{i=1}^{m} x^{(i)}_{zt} {x^{(i)}_{zt}}^\intercal\\\ S_{zt}^{yy} = M_{zt0} V_{zt0} M_{zt0}^\intercal + \sum_{i=1}^{m} z^{(i)}_{t} {z^{(i)}_{t}}^\intercal\\\ S_{zt}^{yx} = M_{zt0} V_{zt0} + \sum_{i=1}^{m} z^{(i)}_{t} {x^{(i)}_{zt}}^\intercal \\\ S_{zt}^{y|x} = S_{zt}^{yy} - S_{zt}^{yx} (S_{zt}^{xx})^{-1} {S_{zt}^{yx}}^\intercal \end{aligned}$$
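These statistics can be computed directly; here is a sketch with numpy in place of JAX and hypothetical names (`Z_prev`, `U`, `Z_next` stack the $m$ replicates at a single time point $t$):

```python
import numpy as np

def dynamics_suff_stats(Z_prev, U, Z_next, M0, V0):
    """Pooled sufficient statistics for one time point t (sketch).

    Z_prev, Z_next: (m, dim_z) arrays of z_{t-1} and z_t replicates;
    U: (m, dim_u) inputs; M0, V0: MNIW prior mean and column precision.
    (Illustrative names, not the dynamax internals.)
    """
    m = Z_prev.shape[0]
    # x_{zt}^{(i)} = [z_{t-1}^{(i)}, u_t^{(i)}, 1]
    X = np.hstack([Z_prev, U, np.ones((m, 1))])
    Sxx = V0 + X.T @ X                           # S^{xx}_{zt}
    Syy = M0 @ V0 @ M0.T + Z_next.T @ Z_next     # S^{yy}_{zt}
    Syx = M0 @ V0 + Z_next.T @ X                 # S^{yx}_{zt}
    # Schur complement: S^{y|x} = S^{yy} - S^{yx} (S^{xx})^{-1} S^{yx}^T
    Sy_x = Syy - Syx @ np.linalg.solve(Sxx, Syx.T)
    return Sxx, Syy, Syx, Sy_x
```

Note that $S^{y|x}_{zt}$ is a Schur complement, hence symmetric whenever $S^{xx}_{zt}$ and $S^{yy}_{zt}$ are.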

Then the posterior

$$Q_t, [F_t, B_t, b_t] \mid (z^{(1)}_{t-1}, u^{(1)}_{t}, z^{(1)}_{t})\dots (z^{(m)}_{t-1}, u^{(m)}_{t}, z^{(m)}_{t}) \sim \mathcal{MN}\mathcal{W}^{-1} (M_{ztm}, V_{ztm}, \nu_{qtm}, \psi_{qtm})$$

where

$$\begin{aligned} M_{ztm} = S_{zt}^{yx} (S_{zt}^{xx})^{-1}\\\ V_{ztm} = S_{zt}^{xx} \\\ \nu_{qtm} = \nu_{qt0} + m\\\ \psi_{qtm} = \psi_{qt0} + S_{zt}^{y|x} \end{aligned}$$
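Given these posterior parameters, a single MNIW draw can be sketched with scipy's `invwishart` and `matrix_normal` distributions (illustrative only; the variable names mirror the equations above and are not the dynamax API):

```python
import numpy as np
from scipy.stats import invwishart, matrix_normal

def sample_mniw_posterior(Sxx, Sy_x, Syx, nu0, Psi0, m, rng=None):
    """Draw (Q_t, W_t) from the MNIW posterior (sketch, not dynamax)."""
    Mn = Syx @ np.linalg.inv(Sxx)    # M_{ztm} = S^{yx} (S^{xx})^{-1}
    nun = nu0 + m                    # nu_{qtm}
    Psin = Psi0 + Sy_x               # psi_{qtm}
    # First sample the noise covariance Q_t ~ IW(nu_n, Psi_n) ...
    Q = np.atleast_2d(invwishart.rvs(df=nun, scale=Psin, random_state=rng))
    # ... then W_t | Q_t ~ MN(M_n, Q_t, (S^{xx})^{-1}):
    # row covariance Q_t, column covariance V_{ztm}^{-1} = (S^{xx})^{-1}.
    W = matrix_normal.rvs(mean=Mn, rowcov=Q, colcov=np.linalg.inv(Sxx),
                          random_state=rng)
    return Q, W
```

The inhomogeneous model simply calls this once per time point $t$ with that time point's statistics.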

In the special case of just one sequence (i.e., $m=1$), there is indeed very little to learn, as you remarked.
The posterior of

$$R_t, [H_t, D_t, d_t] \mid (u^{(1)}_t, z^{(1)}_t, y^{(1)}_t)\dots (u^{(m)}_t, z^{(m)}_t, y^{(m)}_t) \sim \mathcal{MN}\mathcal{W}^{-1} (M_{ytm}, V_{ytm}, \nu_{rtm}, \psi_{rtm})$$

is given in a similar way upon substituting $x^{(i)}_{zt} \rightarrow x^{(i)}_{yt}$, $W_{zt} \rightarrow W_{yt}$, and $z^{(i)}_t \rightarrow y_t^{(i)}$.

When the model is stationary (homogeneous, time-independent), the only change is that we drop the time index and extend the sum over both training examples and time points. For example,

$$S_{z}^{xx} = V_{z0} + \sum_{t=2}^T \sum_{i=1}^{m} x^{(i)}_{zt} {x^{(i)}_{zt}}^\intercal$$
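One subtlety when pooling: each per-time statistic $S_{zt}^{xx}$ already contains the prior term $V_{zt0}$, which should enter the pooled statistic only once. A small sketch of the correction (hypothetical names):

```python
import numpy as np

def pool_sxx_over_time(sxx_per_t, V0):
    """Pool per-time S^{xx}_t statistics into a homogeneous S^{xx}.

    Each S^{xx}_t already includes the prior term V0, so strip the
    duplicates before summing and add V0 back once. (Sketch only.)
    """
    data_only = [S - V0 for S in sxx_per_t]
    return V0 + sum(data_only)
```

The same correction applies to the other statistics that carry a prior term ($S^{yy}$ and $S^{yx}$).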

I hope this clarifies the proposal.

Other thoughts:

  • If you're interested in moving forward with this PR, then we might also consider adapting the following functions for uniformity of the interface:
    • log_prior
    • m_step
    • posterior_predictive
    • {transition,emission}_distribution
  • As an alternative to extending the current LinearGaussianConjugateSSM class, we could also define a new class specifically for inhomogeneous LG(C)SSMs, leaving the existing LinearGaussianConjugateSSM model as is (for homogeneous models).
  • It looks like I need to resolve some conflicts. :-)

I look forward to hearing your thoughts.

@hylkedonker force-pushed the feature/inhomogeneous-lgcssm-gibbs branch from 00f6ee1 to e4f556a on March 27, 2025 at 20:30