[Stack 11/27] Fix D4: pseudocount formula #2435

Closed
jucor wants to merge 2 commits into jc/clj-parity-d2-fix from jc/clj-parity-d4-fix
Conversation

@jucor
Collaborator

@jucor jucor commented Mar 11, 2026

Summary

Stacked on #2421 (Fix D2: in-conv participant threshold + D2c vote count source). Please review and merge #2421 first.
Next in stack: #2436 (Speed up regression tests)

  • Change PSEUDO_COUNT from 1.5 to 2.0, matching Clojure's Beta(2,2) prior
  • This changes probability smoothing from pa = (na + 0.75)/(ns + 1.5) to pa = (na + 1)/(ns + 2)
  • All pa/pd values now match Clojure's p-success exactly (verified on all datasets with Clojure blobs)
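
The effect of the constant change can be sketched in a few lines. This is a minimal illustration rather than the code in repness.py: the helper name `smoothed_probability` is hypothetical, and it assumes the smoothing has the form (na + PSEUDO_COUNT/2) / (ns + PSEUDO_COUNT), which is what the two formulas above imply.

```python
def smoothed_probability(na: int, ns: int, pseudo_count: float) -> float:
    """Smoothed agree probability: (na + pseudo_count/2) / (ns + pseudo_count).

    Hypothetical helper for illustration only; na is the number of agree
    votes and ns is the number of votes seen.
    """
    return (na + pseudo_count / 2) / (ns + pseudo_count)


# Example: 3 agrees out of 4 votes.
old_pa = smoothed_probability(3, 4, 1.5)  # (3 + 0.75) / (4 + 1.5)
new_pa = smoothed_probability(3, 4, 2.0)  # (3 + 1) / (4 + 2)
print(f"old: {old_pa:.4f}, new: {new_pa:.4f}")  # old: 0.6818, new: 0.6667
```

With no votes at all, both constants give 0.5, so the change only affects how strongly small samples are pulled toward that value.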

Changes

  • repness.py: PSEUDO_COUNT = 2.0 with updated comment
  • test_discrepancy_fixes.py: remove xfail from 3 D4 tests (constant check, pa values per dataset, synthetic)
  • test_repness_unit.py, test_old_format_repness.py: import PSEUDO_COUNT instead of hardcoding 1.5
  • simplified_repness_test.py: update hardcoded constant
  • Golden snapshots re-recorded for public datasets (vw, biodiversity)
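
Why the tests import the constant instead of hardcoding it can be shown with a small sketch. Everything here is a stand-in: `PSEUDO_COUNT` is defined locally instead of being imported from repness.py, and `production_pa`, `expected_hardcoded`, and `expected_from_constant` are hypothetical names, not functions from the repository.

```python
PSEUDO_COUNT = 2.0  # stand-in for the constant defined in repness.py


def production_pa(na: int, ns: int) -> float:
    """Stand-in for the production smoothing formula."""
    return (na + PSEUDO_COUNT / 2) / (ns + PSEUDO_COUNT)


def expected_hardcoded(na: int, ns: int) -> float:
    """Test expectation written with the old literal values."""
    return (na + 0.75) / (ns + 1.5)


def expected_from_constant(na: int, ns: int) -> float:
    """Test expectation derived from the shared constant."""
    return (na + PSEUDO_COUNT / 2) / (ns + PSEUDO_COUNT)


na, ns = 7, 10  # hypothetical vote counts
stale_matches = abs(production_pa(na, ns) - expected_hardcoded(na, ns)) < 1e-12
synced_matches = abs(production_pa(na, ns) - expected_from_constant(na, ns)) < 1e-12
print(stale_matches, synced_matches)  # False True
```

An expectation built from literals goes stale as soon as the constant changes, while one built from the shared constant stays in sync with the production formula.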

Test plan

  • TDD red: 6 D4 tests fail before fix
  • TDD green: all 6 D4 tests pass after fix
  • Full public suite: 258 passed, 0 failures
  • Private datasets (--include-local): 60 passed, 0 failures (discrepancy tests)
  • Regression tests pass on public + FLI + bg2018

🤖 Generated with Claude Code

@jucor jucor changed the title [Clj parity PR 2] Fix D4: pseudocount formula [Stack 9/9] Fix D4: pseudocount formula Mar 11, 2026
@jucor jucor force-pushed the jc/clj-parity-d4-fix branch from 19375f2 to b105668 Compare March 11, 2026 12:07
Contributor

Copilot AI left a comment

Pull request overview

Updates Python’s repness pseudocount smoothing constant to match Clojure’s implementation, and re-enables previously xfailed parity tests to enforce the corrected formula going forward.

Changes:

  • Set PSEUDO_COUNT = 2.0 in repness.py (aligning pa/pd smoothing with Clojure’s (na+1)/(ns+2)).
  • Un-xfail D4 pseudocount discrepancy tests now that parity is achieved.
  • Remove remaining hardcoded pseudocount values in unit tests and update docs/journal accordingly.

Reviewed changes

Copilot reviewed 8 out of 10 changed files in this pull request and generated 4 comments.

Summary per file:

| File | Description |
| --- | --- |
| `delphi/polismath/pca_kmeans_rep/repness.py` | Updates pseudocount constant and explanatory comment for smoothing. |
| `delphi/tests/test_discrepancy_fixes.py` | Removes xfail markers for D4 tests so they now enforce parity. |
| `delphi/tests/test_repness_unit.py` | Uses PSEUDO_COUNT from production code instead of hardcoding. |
| `delphi/tests/test_old_format_repness.py` | Same as above for the old-format interface tests. |
| `delphi/tests/simplified_repness_test.py` | Updates the hardcoded pseudocount constant in the simplified script. |
| `delphi/docs/PLAN_DISCREPANCY_FIXES.md` | Marks D4 as DONE in the plan. |
| `delphi/docs/HANDOFF_REGRESSION_TEST_PERF.md` | Adds performance investigation notes (handoff doc). |
| `delphi/docs/CLJ-PARITY-FIXES-JOURNAL.md` | Journals the D4 fix steps and notes the perf investigation. |

Comment thread delphi/tests/simplified_repness_test.py
Comment thread delphi/polismath/pca_kmeans_rep/repness.py
Comment thread delphi/tests/test_repness_unit.py
Comment thread delphi/tests/test_old_format_repness.py
@jucor jucor changed the title [Stack 9/9] Fix D4: pseudocount formula [Stack 9/10] Fix D4: pseudocount formula Mar 11, 2026
@jucor jucor changed the title [Stack 9/10] Fix D4: pseudocount formula [Stack 9/11] Fix D4: pseudocount formula Mar 11, 2026
@jucor jucor force-pushed the jc/clj-parity-d4-fix branch from cb557b3 to f0516e8 Compare March 13, 2026 13:09
@jucor jucor changed the title [Stack 9/11] Fix D4: pseudocount formula [Stack 9/12] Fix D4: pseudocount formula Mar 13, 2026
@jucor jucor force-pushed the jc/clj-parity-d4-fix branch from f0516e8 to ebd71ca Compare March 13, 2026 13:46
@jucor jucor changed the title [Stack 9/12] Fix D4: pseudocount formula [Stack 9/13] Fix D4: pseudocount formula Mar 13, 2026
@jucor jucor force-pushed the jc/clj-parity-d4-fix branch from ebd71ca to 758355c Compare March 13, 2026 14:13
@jucor jucor force-pushed the jc/clj-parity-d2-fix branch from 34fa217 to 62f305c Compare March 13, 2026 15:55
@jucor jucor force-pushed the jc/clj-parity-d4-fix branch from 758355c to 35d24b1 Compare March 13, 2026 15:56
@jucor jucor changed the title [Stack 9/13] Fix D4: pseudocount formula [Stack 9/15] Fix D4: pseudocount formula Mar 16, 2026
@jucor jucor force-pushed the jc/clj-parity-d2-fix branch from 62f305c to cf43dcf Compare March 16, 2026 16:04
@jucor jucor force-pushed the jc/clj-parity-d4-fix branch from 35d24b1 to d295389 Compare March 16, 2026 16:04
@jucor jucor changed the title [Stack 9/15] Fix D4: pseudocount formula [Stack 9/16] Fix D4: pseudocount formula Mar 16, 2026
@jucor jucor changed the title [Stack 9/16] Fix D4: pseudocount formula [Stack 9/17] Fix D4: pseudocount formula Mar 16, 2026
@jucor jucor changed the title [Stack 9/17] Fix D4: pseudocount formula [Stack 9/24] Fix D4: pseudocount formula Mar 17, 2026
@jucor jucor changed the title [Stack 9/24] Fix D4: pseudocount formula [Stack 9/25] Fix D4: pseudocount formula Mar 17, 2026
@jucor jucor force-pushed the jc/clj-parity-d2-fix branch from cf43dcf to d779d86 Compare March 19, 2026 10:43
@jucor jucor force-pushed the jc/clj-parity-d4-fix branch from d295389 to 7e6ccc1 Compare March 19, 2026 10:43
@jucor jucor changed the title [Stack 9/25] Fix D4: pseudocount formula [Stack 8/24] Fix D4: pseudocount formula Mar 19, 2026
@jucor jucor force-pushed the jc/clj-parity-d4-fix branch from 7e6ccc1 to 09a5e5e Compare March 19, 2026 12:26
@jucor jucor changed the title [Stack 8/24] Fix D4: pseudocount formula [Stack 8/25] Fix D4: pseudocount formula Mar 19, 2026
@jucor jucor force-pushed the jc/clj-parity-d2-fix branch from 2bc5575 to 4aceb77 Compare March 23, 2026 17:47
@jucor jucor force-pushed the jc/clj-parity-d4-fix branch from 303fd4e to 081bdb0 Compare March 23, 2026 17:47
@jucor jucor force-pushed the jc/clj-parity-d2-fix branch from 4aceb77 to f9ea97c Compare March 26, 2026 21:24
@jucor jucor force-pushed the jc/clj-parity-d4-fix branch 2 times, most recently from 6093159 to cca501b Compare March 27, 2026 01:15
@jucor jucor force-pushed the jc/clj-parity-d2-fix branch 2 times, most recently from 820aaaf to aabb481 Compare March 27, 2026 02:10
@jucor jucor force-pushed the jc/clj-parity-d4-fix branch 2 times, most recently from 49be8f6 to c620c1e Compare March 27, 2026 10:41
@jucor jucor force-pushed the jc/clj-parity-d2-fix branch from aabb481 to 9bf6805 Compare March 27, 2026 10:41
@jucor jucor changed the title [Stack 9/25] Fix D4: pseudocount formula [Stack 10/26] Fix D4: pseudocount formula Mar 30, 2026
@jucor jucor force-pushed the jc/clj-parity-d4-fix branch from c620c1e to 707c63d Compare March 30, 2026 12:48
@jucor jucor force-pushed the jc/clj-parity-d2-fix branch from 9bf6805 to 21abf22 Compare March 30, 2026 12:48
@jucor jucor changed the title [Stack 10/26] Fix D4: pseudocount formula [Stack 11/27] Fix D4: pseudocount formula Mar 30, 2026
@jucor jucor force-pushed the jc/clj-parity-d4-fix branch from 707c63d to ddb4e01 Compare March 30, 2026 12:54
@jucor jucor force-pushed the jc/clj-parity-d2-fix branch from 21abf22 to 8a05bd9 Compare March 30, 2026 12:54
@jucor jucor requested a review from Copilot March 30, 2026 16:25
Contributor

Copilot AI left a comment

Pull request overview

Copilot reviewed 8 out of 10 changed files in this pull request and generated 4 comments.


**Current**: `PSEUDO_COUNT = 1.5` → `pa = (na + 0.75) / (ns + 1.5)` (Beta(1.75,1.75) prior)
**Target**: `PSEUDO_COUNT = 2.0` → `pa = (na + 1) / (ns + 2)` (Beta(2,2) prior, matching Clojure)
**Current**: ~~`PSEUDO_COUNT = 1.5`~~ → **DONE**: `PSEUDO_COUNT = 2.0` → `pa = (na + 1) / (ns + 2)` (Beta(2,2) prior, matching Clojure)
Copilot AI Mar 30, 2026

This plan entry labels pa=(na+1)/(ns+2) as a "Beta(2,2) prior", but that formula corresponds to Beta(1,1) under the standard Beta-binomial parameterization (since α=1,β=1 gives +1 numerator and +2 denominator). Please fix the prior name in the plan so future parity work doesn’t target the wrong formula.

Suggested change
**Current**: ~~`PSEUDO_COUNT = 1.5`~~ → **DONE**: `PSEUDO_COUNT = 2.0` → `pa = (na + 1) / (ns + 2)` (Beta(2,2) prior, matching Clojure)
**Current**: ~~`PSEUDO_COUNT = 1.5`~~ → **DONE**: `PSEUDO_COUNT = 2.0` → `pa = (na + 1) / (ns + 2)` (Beta(1,1) prior, matching Clojure)
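
The two prior names can be reconciled by asking which point estimate is meant. Under the standard Beta-binomial model, a Beta(α, β) prior with na successes in ns trials has posterior mean (na + α) / (ns + α + β) and posterior mode (na + α − 1) / (ns + α + β − 2). The sketch below checks both readings against the add-one formula. It assumes each of the ns votes is counted as either agree or not agree, and the function names are illustrative; the thread does not say which estimate the original label had in mind.

```python
from fractions import Fraction


def posterior_mean(na: int, ns: int, alpha: int, beta: int) -> Fraction:
    """Mean of the Beta(alpha + na, beta + ns - na) posterior."""
    return Fraction(na + alpha, ns + alpha + beta)


def posterior_mode(na: int, ns: int, alpha: int, beta: int) -> Fraction:
    """Mode of the same posterior (valid when both posterior parameters exceed 1)."""
    return Fraction(na + alpha - 1, ns + alpha + beta - 2)


na, ns = 3, 4
add_one = Fraction(na + 1, ns + 2)  # the formula used after this PR

print(posterior_mean(na, ns, 1, 1) == add_one)  # True: mean under Beta(1,1)
print(posterior_mode(na, ns, 2, 2) == add_one)  # True: mode under Beta(2,2)
print(posterior_mean(na, ns, 2, 2))             # 5/8: mean under Beta(2,2) differs
```

Read as a posterior mean, (na + 1) / (ns + 2) corresponds to Beta(1,1); read as a posterior mode (MAP estimate), it corresponds to Beta(2,2). Naming the estimate alongside the prior in the docs would remove the ambiguity.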

7. Re-recorded golden snapshots for all 7 datasets

### Changes
- `repness.py`: `PSEUDO_COUNT = 2.0`, updated comment to reference Beta(2,2) prior
Copilot AI Mar 30, 2026

This journal entry says the repness pseudocount change references a "Beta(2,2) prior", but the implemented formula pa=(na+1)/(ns+2) corresponds to Beta(1,1) (add-one smoothing) in standard terminology. Please update the prior name here to avoid cementing an incorrect statistical interpretation in the handoff docs.

Suggested change
- `repness.py`: `PSEUDO_COUNT = 2.0`, updated comment to reference Beta(2,2) prior
- `repness.py`: `PSEUDO_COUNT = 2.0`, updated comment to reference a Beta(1,1) (add-one) prior

# - This pulls probabilities toward 0.5 (the prior), with the effect diminishing
# as sample size grows
# - With PSEUDO_COUNT = 2.0, we add 1 "virtual" agree and 1 "virtual" disagree
# to each comment's vote count — equivalent to a Beta(2,2) prior
Copilot AI Mar 30, 2026

The comment claims the (+1 agree/+1 disagree) smoothing is equivalent to a Beta(2,2) prior, but the implemented formula (na + PSEUDO_COUNT/2)/(ns + PSEUDO_COUNT) with PSEUDO_COUNT=2 corresponds to Beta(1,1) (uniform) in the standard Beta-binomial parameterization. Please either fix the wording (e.g., Beta(1,1) / add-one smoothing) or, if Beta(2,2) is truly intended, adjust the formula/constant accordingly (would be +2 numerator, +4 denominator).

Suggested change
# to each comment's vote count — equivalent to a Beta(2,2) prior
# to each comment's vote count — equivalent to a Beta(1,1) (uniform) prior
# / standard add-one smoothing
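
The claim in the code comment that the pull toward 0.5 diminishes as the sample grows can be checked numerically. This is a minimal sketch assuming the formula (na + PSEUDO_COUNT/2) / (ns + PSEUDO_COUNT) quoted in the review comment; the helper name `smoothed` and the vote counts are illustrative.

```python
PSEUDO_COUNT = 2.0  # value set by this PR


def smoothed(na: int, ns: int, pseudo_count: float = PSEUDO_COUNT) -> float:
    """Smoothed agree probability; illustrative helper, not the repness.py code."""
    return (na + pseudo_count / 2) / (ns + pseudo_count)


# Hold the raw agree rate at 0.8 while the number of votes grows.
for ns in (5, 50, 500):
    na = 4 * ns // 5
    print(ns, round(smoothed(na, ns), 4))
# 5 0.7143
# 50 0.7885
# 500 0.7988
```

The smoothed value stays below the raw rate of 0.8 but approaches it as ns increases, so the choice of pseudocount matters most for comments with few votes.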

Comment on lines 502 to 505
"""
D4: Python uses PSEUDO_COUNT = 1.5 → pa = (na + 0.75) / (ns + 1.5)
Clojure uses PSEUDO_COUNT = 2.0 → pa = (na + 1) / (ns + 2)
"""
Copilot AI Mar 30, 2026

This D4 class docstring still states "Python uses PSEUDO_COUNT = 1.5" even though the constant has now been changed to 2.0. Since these tests are now validating the post-fix behavior, update the docstring to describe the historical discrepancy (before fix) or to describe the current aligned formula for both implementations.

@jucor jucor force-pushed the jc/clj-parity-d2-fix branch from 8a05bd9 to adac3bb Compare March 30, 2026 16:49
@jucor jucor force-pushed the jc/clj-parity-d4-fix branch from ddb4e01 to 8fa1cf3 Compare March 30, 2026 16:49
jucor and others added 2 commits March 30, 2026 17:58
The pseudocount constant controls Bayesian smoothing of vote probabilities.
Python used 1.5 (Beta(1.75,1.75)), Clojure uses 2.0 (Beta(2,2)).

This changes pa = (na + 0.75)/(ns + 1.5) to pa = (na + 1)/(ns + 2),
matching Clojure's repness.clj exactly.

Changes:
- repness.py: PSEUDO_COUNT = 2.0, updated comment
- test_discrepancy_fixes.py: remove xfail from 3 D4 tests
- test_repness_unit.py, test_old_format_repness.py: use PSEUDO_COUNT
  import instead of hardcoded 1.5
- simplified_repness_test.py: update hardcoded constant
- Golden snapshots re-recorded for vw and biodiversity

TDD: red (6 D4 tests failed) → fix → green (all 6 pass)
Full suite: 258 passed, 3 skipped, 30 xfailed, 0 failures

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@jucor jucor force-pushed the jc/clj-parity-d2-fix branch from adac3bb to d213a79 Compare March 30, 2026 17:05
@jucor jucor force-pushed the jc/clj-parity-d4-fix branch from 8fa1cf3 to 5893382 Compare March 30, 2026 17:05
@github-actions
Delphi Coverage Report

| File | Stmts | Miss | Cover |
| --- | ---: | ---: | ---: |
| `__init__.py` | 2 | 0 | 100% |
| `benchmarks/bench_pca.py` | 76 | 76 | 0% |
| `benchmarks/bench_repness.py` | 81 | 81 | 0% |
| `benchmarks/bench_update_votes.py` | 38 | 38 | 0% |
| `benchmarks/benchmark_utils.py` | 34 | 34 | 0% |
| `components/__init__.py` | 1 | 0 | 100% |
| `components/config.py` | 165 | 133 | 19% |
| `conversation/__init__.py` | 2 | 0 | 100% |
| `conversation/conversation.py` | 1117 | 328 | 71% |
| `conversation/manager.py` | 131 | 42 | 68% |
| `database/__init__.py` | 1 | 0 | 100% |
| `database/dynamodb.py` | 387 | 234 | 40% |
| `database/postgres.py` | 305 | 205 | 33% |
| `pca_kmeans_rep/__init__.py` | 5 | 0 | 100% |
| `pca_kmeans_rep/clusters.py` | 257 | 22 | 91% |
| `pca_kmeans_rep/corr.py` | 98 | 17 | 83% |
| `pca_kmeans_rep/pca.py` | 52 | 16 | 69% |
| `pca_kmeans_rep/repness.py` | 361 | 51 | 86% |
| `pca_kmeans_rep/stats.py` | 107 | 22 | 79% |
| `regression/__init__.py` | 4 | 0 | 100% |
| `regression/clojure_comparer.py` | 188 | 17 | 91% |
| `regression/comparer.py` | 887 | 720 | 19% |
| `regression/datasets.py` | 135 | 27 | 80% |
| `regression/recorder.py` | 36 | 27 | 25% |
| `regression/utils.py` | 137 | 118 | 14% |
| `run_math_pipeline.py` | 260 | 114 | 56% |
| `umap_narrative/500_generate_embedding_umap_cluster.py` | 210 | 109 | 48% |
| `umap_narrative/501_calculate_comment_extremity.py` | 112 | 54 | 52% |
| `umap_narrative/502_calculate_priorities.py` | 135 | 135 | 0% |
| `umap_narrative/700_datamapplot_for_layer.py` | 502 | 502 | 0% |
| `umap_narrative/701_static_datamapplot_for_layer.py` | 310 | 310 | 0% |
| `umap_narrative/702_consensus_divisive_datamapplot.py` | 432 | 432 | 0% |
| `umap_narrative/801_narrative_report_batch.py` | 785 | 785 | 0% |
| `umap_narrative/802_process_batch_results.py` | 265 | 265 | 0% |
| `umap_narrative/803_check_batch_status.py` | 175 | 175 | 0% |
| `umap_narrative/llm_factory_constructor/__init__.py` | 2 | 2 | 0% |
| `umap_narrative/llm_factory_constructor/model_provider.py` | 157 | 157 | 0% |
| `umap_narrative/polismath_commentgraph/__init__.py` | 1 | 0 | 100% |
| `umap_narrative/polismath_commentgraph/cli.py` | 270 | 270 | 0% |
| `umap_narrative/polismath_commentgraph/core/__init__.py` | 3 | 3 | 0% |
| `umap_narrative/polismath_commentgraph/core/clustering.py` | 108 | 108 | 0% |
| `umap_narrative/polismath_commentgraph/core/embedding.py` | 104 | 104 | 0% |
| `umap_narrative/polismath_commentgraph/lambda_handler.py` | 219 | 219 | 0% |
| `umap_narrative/polismath_commentgraph/schemas/__init__.py` | 2 | 0 | 100% |
| `umap_narrative/polismath_commentgraph/schemas/dynamo_models.py` | 160 | 9 | 94% |
| `umap_narrative/polismath_commentgraph/tests/conftest.py` | 17 | 17 | 0% |
| `umap_narrative/polismath_commentgraph/tests/test_clustering.py` | 74 | 74 | 0% |
| `umap_narrative/polismath_commentgraph/tests/test_embedding.py` | 55 | 55 | 0% |
| `umap_narrative/polismath_commentgraph/tests/test_storage.py` | 87 | 87 | 0% |
| `umap_narrative/polismath_commentgraph/utils/__init__.py` | 3 | 0 | 100% |
| `umap_narrative/polismath_commentgraph/utils/converter.py` | 283 | 237 | 16% |
| `umap_narrative/polismath_commentgraph/utils/group_data.py` | 354 | 336 | 5% |
| `umap_narrative/polismath_commentgraph/utils/storage.py` | 584 | 477 | 18% |
| `umap_narrative/reset_conversation.py` | 159 | 50 | 69% |
| `umap_narrative/run_pipeline.py` | 453 | 312 | 31% |
| `utils/general.py` | 62 | 41 | 34% |
| Total | 10950 | 7647 | 30% |

This was referenced Mar 30, 2026
@jucor
Collaborator Author

jucor commented Mar 30, 2026

Superseded by spr-managed PR stack. See the new stack starting at #2508.

@jucor jucor closed this Mar 30, 2026