Generalize r2_score #48

Open
@sethaxen

Background

Looking at the Bayesian $$R^2$$ paper (preprint) and supplement, the approach yields posterior draws of the Bayesian $$R^2$$, which we compute with the internal function r2_sample. r2_score then returns only the mean and standard deviation of these draws.
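For concreteness, here is a rough Python/NumPy sketch of what "draws of $$R^2$$" means, using the residual-based form (variance of the fit divided by variance of the fit plus variance of the residuals, computed per posterior draw). The name `r2_draws` and the argument layout are made up for illustration and are not the package's API.

```python
import numpy as np

def r2_draws(y, y_pred):
    """One Bayesian R^2 value per posterior draw (hypothetical helper).

    y:      observed responses, shape (n,)
    y_pred: predictions, shape (n_draws, n)
    """
    y = np.asarray(y, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    var_fit = y_pred.var(axis=-1)        # variance of the fit across observations
    var_res = (y - y_pred).var(axis=-1)  # variance of the residuals across observations
    return var_fit / (var_fit + var_res)

# Toy data standing in for real posterior predictions.
rng = np.random.default_rng(0)
y = rng.normal(size=50)
y_pred = y + rng.normal(scale=0.5, size=(200, 50))

draws = r2_draws(y, y_pred)
# The current r2_score behavior corresponds to summarizing the draws like this.
summary = (draws.mean(), draws.std())
```

Returning `draws` instead of `summary` is the difference under discussion.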

Similarly, the supplement linked above introduces an approach for estimating a LOO-$$R^2$$. Approximate posterior draws of this estimate are obtained using the Bayesian bootstrap.
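A minimal sketch of the Bayesian bootstrap step, as I understand it: draw Dirichlet(1, ..., 1) weights over the observations, then compare the weighted variance of the LOO residuals with the weighted variance of the data. It assumes the LOO predictive means have already been computed elsewhere (e.g. with PSIS); `loo_r2_draws` and its arguments are hypothetical, and details of the supplement's version (such as any clipping of the result) are omitted.

```python
import numpy as np

def loo_r2_draws(y, y_loo_mean, n_draws=4000, seed=None):
    """Approximate draws of LOO-R^2 via the Bayesian bootstrap (hypothetical helper).

    y:          observed responses, shape (n,)
    y_loo_mean: LOO predictive mean for each observation, shape (n,)
    """
    rng = np.random.default_rng(seed)
    y = np.asarray(y, dtype=float)
    e_loo = np.asarray(y_loo_mean, dtype=float) - y
    # Each row is one set of bootstrap weights over the n observations.
    w = rng.dirichlet(np.ones(y.size), size=n_draws)
    # Weighted variances: E_w[x^2] - E_w[x]^2 for each weight vector.
    var_y = w @ y**2 - (w @ y) ** 2
    var_e = w @ e_loo**2 - (w @ e_loo) ** 2
    return 1.0 - var_e / var_y

# Toy data standing in for real LOO predictive means.
rng = np.random.default_rng(1)
y = rng.normal(size=40)
y_loo_mean = y + rng.normal(scale=0.3, size=40)
loo_draws = loo_r2_draws(y, y_loo_mean, n_draws=1000, seed=2)
```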

For multivariate models, $$R^2$$ can be computed pointwise for each response variable, but we currently support only univariate response variables.

Proposal

I propose the following improvements:

  • Rename r2_sample to r2_score so that we return the MCMC draws (a user can trivially pass these to summarize to get whichever summary statistics they'd like). Note that this differs from Python ArviZ.
  • Generalize r2_score to support multiple outputs.
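To illustrate the second point, one possible shape convention for multiple outputs is sketched below in Python/NumPy: take the variances over the observation axis only, so that each response variable gets its own set of draws. The function name and the `(n_draws, n, k)` layout are assumptions for illustration, not a proposed signature.

```python
import numpy as np

def r2_draws_multi(y, y_pred):
    """Bayesian R^2 draws per response variable (hypothetical helper).

    y:      observed responses, shape (n, k)
    y_pred: predictions, shape (n_draws, n, k)
    Returns an array of shape (n_draws, k).
    """
    y = np.asarray(y, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    # Reduce over the observation axis (second to last); keep draws and outputs.
    var_fit = y_pred.var(axis=-2)
    var_res = (y - y_pred).var(axis=-2)
    return var_fit / (var_fit + var_res)

# Toy data: 30 observations of 3 response variables, 100 draws.
rng = np.random.default_rng(0)
y_multi = rng.normal(size=(30, 3))
y_pred_multi = y_multi + rng.normal(scale=0.5, size=(100, 30, 3))
r2_multi = r2_draws_multi(y_multi, y_pred_multi)
```

With `k = 1` this reduces to the univariate computation, so the univariate case could be handled by the same code path.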

Misc

Something like loo_r2_score would be nice to have as well, though it's worth thinking carefully about the API and about the core primitives we need to support similar functionality for other predictive metrics. Some metrics just compare the LOO-predictive mean with the data, so they require only posterior predictions, data, and log-likelihood evaluations, and could accept an arbitrary user-provided binary metric (see e.g. https://mc-stan.org/loo/reference/loo_predictive_metric.html). Others, like CRPS (https://mc-stan.org/loo/reference/crps.html), are computed pointwise and may require additional inputs (e.g. CRPS requires 2 independent predictive draws per posterior draw), so it's difficult to unify these into a single function. In each of these cases, the metric can be computed both on the posterior and on the LOO posteriors.
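As a rough sketch of the first kind of metric, the primitive could look something like the following in Python/NumPy: compute the LOO-predictive mean from importance weights, then hand it to a user-supplied binary metric. Everything here is hypothetical (`loo_predictive_metric`, its argument order, the `metric` callable); in particular it assumes log importance weights are supplied by the caller and does no Pareto smoothing itself.

```python
import numpy as np

def loo_predictive_metric(y, y_pred, log_weights, metric):
    """Apply a binary metric to data and LOO-predictive means (hypothetical helper).

    y:           observed responses, shape (n,)
    y_pred:      posterior predictions, shape (n_draws, n)
    log_weights: log importance weights per draw and observation, shape (n_draws, n)
    metric:      callable taking (y, y_loo_mean) and returning a number
    """
    y = np.asarray(y, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    lw = np.asarray(log_weights, dtype=float)
    # Self-normalize the weights per observation, subtracting the max for stability.
    w = np.exp(lw - lw.max(axis=0))
    w /= w.sum(axis=0)
    y_loo_mean = (w * y_pred).sum(axis=0)
    return metric(y, y_loo_mean)

def mae(y, y_hat):
    """Mean absolute error, as an example user-provided metric."""
    return np.mean(np.abs(y - y_hat))

# Toy data; uniform weights reduce the LOO mean to the plain posterior mean.
rng = np.random.default_rng(0)
y = rng.normal(size=20)
y_pred = y + rng.normal(scale=0.4, size=(500, 20))
uniform_lw = np.zeros_like(y_pred)
score = loo_predictive_metric(y, y_pred, uniform_lw, mae)
```

Passing uniform weights gives the metric on the posterior, and passing LOO weights gives it on the LOO posteriors, which suggests the two variants could share one code path. A CRPS-style metric would not fit this signature, since it needs the predictive draws themselves rather than just their mean.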
