[Stack 11/27] Fix D4: pseudocount formula#2435
Conversation
19375f2 to
b105668
Compare
There was a problem hiding this comment.
Pull request overview
Updates Python’s repness pseudocount smoothing constant to match Clojure’s implementation, and re-enables previously xfailed parity tests to enforce the corrected formula going forward.
Changes:
- Set
PSEUDO_COUNT = 2.0inrepness.py(aligningpa/pdsmoothing with Clojure’s(na+1)/(ns+2)). - Un-xfail D4 pseudocount discrepancy tests now that parity is achieved.
- Remove remaining hardcoded pseudocount values in unit tests and update docs/journal accordingly.
Reviewed changes
Copilot reviewed 8 out of 10 changed files in this pull request and generated 4 comments.
Show a summary per file
| File | Description |
|---|---|
| delphi/polismath/pca_kmeans_rep/repness.py | Updates pseudocount constant and explanatory comment for smoothing. |
| delphi/tests/test_discrepancy_fixes.py | Removes xfail markers for D4 tests so they now enforce parity. |
| delphi/tests/test_repness_unit.py | Uses PSEUDO_COUNT from production code instead of hardcoding. |
| delphi/tests/test_old_format_repness.py | Same as above for the old-format interface tests. |
| delphi/tests/simplified_repness_test.py | Updates the hardcoded pseudocount constant in the simplified script. |
| delphi/docs/PLAN_DISCREPANCY_FIXES.md | Marks D4 as DONE in the plan. |
| delphi/docs/HANDOFF_REGRESSION_TEST_PERF.md | Adds performance investigation notes (handoff doc). |
| delphi/docs/CLJ-PARITY-FIXES-JOURNAL.md | Journals the D4 fix steps and notes the perf investigation. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
cb557b3 to
f0516e8
Compare
f0516e8 to
ebd71ca
Compare
ebd71ca to
758355c
Compare
34fa217 to
62f305c
Compare
758355c to
35d24b1
Compare
62f305c to
cf43dcf
Compare
35d24b1 to
d295389
Compare
cf43dcf to
d779d86
Compare
d295389 to
7e6ccc1
Compare
7e6ccc1 to
09a5e5e
Compare
2bc5575 to
4aceb77
Compare
303fd4e to
081bdb0
Compare
4aceb77 to
f9ea97c
Compare
6093159 to
cca501b
Compare
820aaaf to
aabb481
Compare
49be8f6 to
c620c1e
Compare
aabb481 to
9bf6805
Compare
c620c1e to
707c63d
Compare
9bf6805 to
21abf22
Compare
707c63d to
ddb4e01
Compare
21abf22 to
8a05bd9
Compare
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 8 out of 10 changed files in this pull request and generated 4 comments.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
|
|
||
| **Current**: `PSEUDO_COUNT = 1.5` → `pa = (na + 0.75) / (ns + 1.5)` (Beta(1.75,1.75) prior) | ||
| **Target**: `PSEUDO_COUNT = 2.0` → `pa = (na + 1) / (ns + 2)` (Beta(2,2) prior, matching Clojure) | ||
| **Current**: ~~`PSEUDO_COUNT = 1.5`~~ → **DONE**: `PSEUDO_COUNT = 2.0` → `pa = (na + 1) / (ns + 2)` (Beta(2,2) prior, matching Clojure) |
There was a problem hiding this comment.
This plan entry labels pa=(na+1)/(ns+2) as a "Beta(2,2) prior", but that formula corresponds to Beta(1,1) under the standard Beta-binomial parameterization (since α=1,β=1 gives +1 numerator and +2 denominator). Please fix the prior name in the plan so future parity work doesn’t target the wrong formula.
| **Current**: ~~`PSEUDO_COUNT = 1.5`~~ → **DONE**: `PSEUDO_COUNT = 2.0` → `pa = (na + 1) / (ns + 2)` (Beta(2,2) prior, matching Clojure) | |
| **Current**: ~~`PSEUDO_COUNT = 1.5`~~ → **DONE**: `PSEUDO_COUNT = 2.0` → `pa = (na + 1) / (ns + 2)` (Beta(1,1) prior, matching Clojure) |
| 7. Re-recorded golden snapshots for all 7 datasets | ||
|
|
||
| ### Changes | ||
| - `repness.py`: `PSEUDO_COUNT = 2.0`, updated comment to reference Beta(2,2) prior |
There was a problem hiding this comment.
This journal entry says the repness pseudocount change references a "Beta(2,2) prior", but the implemented formula pa=(na+1)/(ns+2) corresponds to Beta(1,1) (add-one smoothing) in standard terminology. Please update the prior name here to avoid cementing an incorrect statistical interpretation in the handoff docs.
| - `repness.py`: `PSEUDO_COUNT = 2.0`, updated comment to reference Beta(2,2) prior | |
| - `repness.py`: `PSEUDO_COUNT = 2.0`, updated comment to reference a Beta(1,1) (add-one) prior |
| # - This pulls probabilities toward 0.5 (the prior), with the effect diminishing | ||
| # as sample size grows | ||
| # - With PSEUDO_COUNT = 2.0, we add 1 "virtual" agree and 1 "virtual" disagree | ||
| # to each comment's vote count — equivalent to a Beta(2,2) prior |
There was a problem hiding this comment.
The comment claims the (+1 agree/+1 disagree) smoothing is equivalent to a Beta(2,2) prior, but the implemented formula (na + PSEUDO_COUNT/2)/(ns + PSEUDO_COUNT) with PSEUDO_COUNT=2 corresponds to Beta(1,1) (uniform) in the standard Beta-binomial parameterization. Please either fix the wording (e.g., Beta(1,1) / add-one smoothing) or, if Beta(2,2) is truly intended, adjust the formula/constant accordingly (would be +2 numerator, +4 denominator).
| # to each comment's vote count — equivalent to a Beta(2,2) prior | |
| # to each comment's vote count — equivalent to a Beta(1,1) (uniform) prior | |
| # / standard add-one smoothing |
| """ | ||
| D4: Python uses PSEUDO_COUNT = 1.5 → pa = (na + 0.75) / (ns + 1.5) | ||
| Clojure uses PSEUDO_COUNT = 2.0 → pa = (na + 1) / (ns + 2) | ||
| """ |
There was a problem hiding this comment.
This D4 class docstring still states "Python uses PSEUDO_COUNT = 1.5" even though the constant has now been changed to 2.0. Since these tests are now validating the post-fix behavior, update the docstring to describe the historical discrepancy (before fix) or to describe the current aligned formula for both implementations.
8a05bd9 to
adac3bb
Compare
ddb4e01 to
8fa1cf3
Compare
The pseudocount constant controls Bayesian smoothing of vote probabilities. Python used 1.5 (Beta(1.75,1.75)), Clojure uses 2.0 (Beta(2,2)). This changes pa = (na + 0.75)/(ns + 1.5) to pa = (na + 1)/(ns + 2), matching Clojure's repness.clj exactly. Changes: - repness.py: PSEUDO_COUNT = 2.0, updated comment - test_discrepancy_fixes.py: remove xfail from 3 D4 tests - test_repness_unit.py, test_old_format_repness.py: use PSEUDO_COUNT import instead of hardcoded 1.5 - simplified_repness_test.py: update hardcoded constant - Golden snapshots re-recorded for vw and biodiversity TDD: red (6 D4 tests failed) → fix → green (all 6 pass) Full suite: 258 passed, 3 skipped, 30 xfailed, 0 failures Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
adac3bb to
d213a79
Compare
8fa1cf3 to
5893382
Compare
Delphi Coverage Report
|
|
Superseded by spr-managed PR stack. See the new stack starting at #2508. |
Summary
PSEUDO_COUNTfrom 1.5 to 2.0, matching Clojure's Beta(2,2) priorpa = (na + 0.75)/(ns + 1.5)topa = (na + 1)/(ns + 2)pa/pdvalues now match Clojure'sp-successexactly (verified on all datasets with Clojure blobs)Changes
repness.py:PSEUDO_COUNT = 2.0with updated commenttest_discrepancy_fixes.py: remove xfail from 3 D4 tests (constant check, pa values per dataset, synthetic)test_repness_unit.py,test_old_format_repness.py: importPSEUDO_COUNTinstead of hardcoding 1.5simplified_repness_test.py: update hardcoded constantTest plan
🤖 Generated with Claude Code