Commit 0e4de78

Add PAV-adjusted calibration plot (#108)

* add pav-adjusted calibration plot
* fix caption
* update requirements.txt

1 parent b1edeea commit 0e4de78

File tree

3 files changed: +41 −15 lines changed


Chapters/Prior_posterior_predictive_checks.qmd

Lines changed: 10 additions & 12 deletions
@@ -411,20 +411,18 @@ Both models predict more values in the tail than observed, even if with low prob
 
 ### Posterior predictive checks for binary data
 
-Binary data is a common form of discrete data, often used to represent outcomes like yes/no, success/failure, or 0/1. Modelling binary data poses a unique challenge for assessing model fit because these models generate predicted values on a probability scale (0-1), while the actual values of the response variable are dichotomous (either 0 or 1).
+Binary data is a common form of discrete data, often used to represent outcomes like yes/no, success/failure, or 0/1. We may be tempted to assess the fit of a binary model using a bar plot, or a plot similar to the rootogram we showed in the previous section, but this is not a good idea. The reason is that even a very simple model, with a single parameter corresponding to the proportion of one class, can match that proportion perfectly, so a bar plot will not show any deviation [@Säilynoja_2025].
 
-One solution to this challenge was presented by @Greenhill_2011 and is known as a separation plot. This graphical tool consists of a sequence of bars, where each bar represents a data point. Bars can have one of two colours, one for positive cases and one for negative cases. The bars are sorted by the predicted probabilities, so that the bars with the lowest predicted probabilities are on the left and those with the highest are on the right. Usually the plot also includes a marker showing the expected number of total events. For an ideal fit, all the bars of one colour should be on one side of the marker and all the bars of the other colour on the other side.
+One solution to this challenge is to use so-called calibration or reliability plots. To create this kind of plot we first bin the predicted probabilities (e.g., [0.0–0.1], [0.1–0.2], ..., [0.9–1.0]) and then, for each bin, compute the fraction of observed positive outcomes. In this way we can compare the predicted probabilities to the observed frequencies. The ideal calibration plot is a diagonal line, where the predicted probabilities equal the observed frequencies.
 
-The following example shows a separation plot for a logistic regression model.
+The problem with this approach is that in practice we don't have good rules for selecting the bins, and different binnings can produce plots that look drastically different [@Dimitriadis_2021]. An alternative is the method proposed by @Dimitriadis_2021. This method uses conditional event probabilities (CEPs), that is, the probability that an event occurs given that the classifier has assigned a specific predicted probability. To compute the CEPs, the authors use the pool adjacent violators (PAV) algorithm [@Ayer_1955], which provides a way to assign CEPs that are monotonic (i.e. they increase or stay the same, but never decrease) with respect to the model predictions. This monotonicity assumption is reasonable for calibrated models, where higher predicted probabilities should correspond to higher actual event probabilities.
+
+@fig-ppc_pava shows a calibration plot for a dummy logistic regression model. As previously mentioned, the ideal calibration plot is a diagonal line, where the predicted probabilities equal the observed frequencies. If the line is above the diagonal the model is underestimating the probabilities; if it is below the diagonal, the model is overestimating them. The plot also includes confidence bands for the CEPs, computed using the method proposed by @Dimitriadis_2021.
 
 ```{python}
-#| label: fig-post_pred_sep
-#| fig-cap: "Separation plot for a dummy logistic regression model."
-idata = az.load_arviz_data('classification10d')
-
-az.plot_separation(idata=idata,
-                   y='outcome',
-                   y_hat='outcome',
-                   expected_events=True,
-                   figsize=(10, 1))
+#| label: fig-ppc_pava
+#| fig-cap: "PAV-adjusted calibration plot for a dummy logistic regression model."
+dt = azb.load_arviz_data('classification10d')
+
+azp.plot_ppc_pava(dt)
 ```
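The binned approach that the new text describes (and then argues against) is easy to sketch. The following is a minimal NumPy illustration; `binned_calibration` and its interface are names invented here, not part of ArviZ or any library:

```python
import numpy as np

def binned_calibration(y_true, y_prob, n_bins=10):
    """Fraction of observed positives per predicted-probability bin."""
    y_true = np.asarray(y_true, dtype=float)
    y_prob = np.asarray(y_prob)
    edges = np.linspace(0.0, 1.0, n_bins + 1)
    # right-closed bins; clip so p == 0.0 and p == 1.0 land in the end bins
    idx = np.clip(np.digitize(y_prob, edges, right=True) - 1, 0, n_bins - 1)
    centers, freqs = [], []
    for b in range(n_bins):
        mask = idx == b
        if mask.any():  # skip empty bins
            centers.append(y_prob[mask].mean())   # mean prediction in bin
            freqs.append(y_true[mask].mean())     # observed positive fraction
    return np.array(centers), np.array(freqs)
```

Plotting `freqs` against `centers` gives the classical reliability diagram; as the text notes, its shape can change drastically with `n_bins`, which is precisely the instability the PAV-based approach avoids.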

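The PAV step described above can also be sketched directly: sort the observations by predicted probability, then pool adjacent blocks whose empirical frequencies violate monotonicity. This is an illustrative implementation only (`pav_ceps` is a hypothetical name, not the authors' code nor the ArviZ implementation):

```python
import numpy as np

def pav_ceps(y_true, y_prob):
    """Monotone conditional event probabilities via pool adjacent violators."""
    order = np.argsort(y_prob)
    y = np.asarray(y_true, dtype=float)[order]
    merged = []  # each block is [sum of outcomes, count]
    for yi in y:
        merged.append([yi, 1.0])
        # pool while the previous block's mean exceeds the last one's
        while len(merged) > 1 and merged[-2][0] / merged[-2][1] > merged[-1][0] / merged[-1][1]:
            s, c = merged.pop()
            merged[-1][0] += s
            merged[-1][1] += c
    # expand block means back to one CEP per sorted observation
    ceps = np.concatenate([np.full(int(c), s / c) for s, c in merged])
    return np.sort(np.asarray(y_prob)), ceps
```

Plotting the returned CEPs against the sorted predictions yields the monotone curve of the PAV-adjusted calibration plot; the confidence bands shown in the figure require the additional machinery of @Dimitriadis_2021 and are not sketched here.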
references.bib

Lines changed: 28 additions & 0 deletions

@@ -550,4 +550,32 @@ @article{Gelman_2013b
   year = {2013},
   doi = {10.1214/13-EJS854},
   URL = {https://doi.org/10.1214/13-EJS854}
+}
+
+@article{Dimitriadis_2021,
+  title = {Stable reliability diagrams for probabilistic classifiers},
+  volume = {118},
+  url = {https://www.pnas.org/doi/abs/10.1073/pnas.2016191118},
+  doi = {10.1073/pnas.2016191118},
+  number = {8},
+  urldate = {2025-03-07},
+  journal = {Proceedings of the National Academy of Sciences},
+  author = {Dimitriadis, Timo and Gneiting, Tilmann and Jordan, Alexander I.},
+  month = feb,
+  year = {2021},
+  note = {Publisher: Proceedings of the National Academy of Sciences},
+  pages = {e2016191118},
+}
+
+@article{Ayer_1955,
+  author = {Miriam Ayer and H. D. Brunk and G. M. Ewing and W. T. Reid and Edward Silverman},
+  title = {{An Empirical Distribution Function for Sampling with Incomplete Information}},
+  volume = {26},
+  journal = {The Annals of Mathematical Statistics},
+  number = {4},
+  publisher = {Institute of Mathematical Statistics},
+  pages = {641--647},
+  year = {1955},
+  doi = {10.1214/aoms/1177728423},
+  URL = {https://doi.org/10.1214/aoms/1177728423}
 }

requirements.txt

Lines changed: 3 additions & 3 deletions

@@ -4,6 +4,6 @@ arviz-stats @ git+https://github.com/arviz-devs/arviz-stats
 arviz-plots @ git+https://github.com/arviz-devs/arviz-plots
 bambi==0.15.0
 kulprit @ git+https://github.com/bambinos/kulprit
-preliz==0.15.0
-pymc==5.21.0
-pymc-bart==0.8.2
+preliz==0.16.0
+pymc==5.21.1
+pymc-bart==0.9.0
