Fix D4: pseudocount formula #2514

Open
jucor wants to merge 1 commit into spr/edge/c0a682ec from spr/edge/6ae3ee43

@jucor (Collaborator) commented Mar 30, 2026

Summary

  • Change PSEUDO_COUNT from 1.5 to 2.0, matching Clojure's Beta(2,2) prior
  • This changes probability smoothing from pa = (na + 0.75)/(ns + 1.5) to pa = (na + 1)/(ns + 2)
  • All pa/pd values now match Clojure's p-success exactly (verified on all datasets with Clojure blobs)
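To make the effect of the constant concrete, here is a minimal sketch of the smoothing formula quoted above. The function name `smoothed_probability` and the vote counts are illustrative assumptions, not code from `repness.py`.

```python
def smoothed_probability(n_agree: int, n_votes: int, pseudo_count: float) -> float:
    """Return (n_agree + pseudo_count / 2) / (n_votes + pseudo_count)."""
    return (n_agree + pseudo_count / 2) / (n_votes + pseudo_count)


# Hypothetical counts: 3 agrees out of 4 votes seen.
old_pa = smoothed_probability(3, 4, pseudo_count=1.5)  # (3 + 0.75) / (4 + 1.5)
new_pa = smoothed_probability(3, 4, pseudo_count=2.0)  # (3 + 1) / (4 + 2)
print(f"old pa = {old_pa:.4f}, new pa = {new_pa:.4f}")
# old pa = 0.6818, new pa = 0.6667
```

Both constants give 0.5 when there are no votes. They differ only in how strongly small samples are pulled toward 0.5.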

Changes

  • repness.py: PSEUDO_COUNT = 2.0 with updated comment
  • test_discrepancy_fixes.py: remove xfail from 3 D4 tests (constant check, pa values per dataset, synthetic)
  • test_repness_unit.py, test_old_format_repness.py: import PSEUDO_COUNT instead of hardcoding 1.5
  • simplified_repness_test.py: update hardcoded constant
  • Golden snapshots re-recorded for public datasets (vw, biodiversity)
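The move from a hardcoded 1.5 to an imported constant can be pictured with the self-contained sketch below. It is illustrative only: `prob_agree` and the test name are hypothetical, and the module-level `PSEUDO_COUNT` stands in for the constant that the real tests import from `repness.py`.

```python
# Stand-in for the constant defined in repness.py (the real tests import it).
PSEUDO_COUNT = 2.0


def prob_agree(n_agree: int, n_votes: int) -> float:
    """Hypothetical helper applying the smoothing formula from the summary."""
    return (n_agree + PSEUDO_COUNT / 2) / (n_votes + PSEUDO_COUNT)


def test_expected_pa_follows_constant() -> None:
    n_agree, n_votes = 7, 10
    # Derive the expectation from the constant instead of writing 1.5 or 2.0,
    # so the test does not need editing if the constant changes again.
    expected = (n_agree + PSEUDO_COUNT / 2) / (n_votes + PSEUDO_COUNT)
    assert abs(prob_agree(n_agree, n_votes) - expected) < 1e-12


test_expected_pa_follows_constant()
```

A test written this way cannot detect a wrong constant on its own, which is presumably why the D4 tests also include a separate constant check.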

Test plan

  • TDD red: 6 D4 tests fail before fix
  • TDD green: all 6 D4 tests pass after fix
  • Full public suite: 258 passed, 0 failures
  • Private datasets (--include-local): 60 passed, 0 failures (discrepancy tests)
  • Regression tests pass on public + FLI + bg2018

🤖 Generated with Claude Code

Squashed commits

  • Fix D4: PSEUDO_COUNT 1.5 → 2.0 to match Clojure's Beta(2,2) prior
  • Journal: add session 6 (D4 fix), update plan marking D4 done

commit-id:6ae3ee43


Stack:


⚠️ Part of a stack created by spr. Do not merge manually using the UI - doing so may have unexpected results.

Copilot AI left a comment


Pull request overview

Updates Delphi’s representativeness (“repness”) probability smoothing constant to match Clojure’s implementation, so computed pa/pd values align exactly across languages.

Changes:

  • Set PSEUDO_COUNT to 2.0 in repness.py and update the explanatory comment.
  • Un-xfail D4 discrepancy tests now that the constant matches Clojure.
  • Remove remaining hardcoded 1.5 pseudocount usage in unit/compat tests (import PSEUDO_COUNT instead) and update the simplified repness script constant.

Reviewed changes

Copilot reviewed 8 out of 8 changed files in this pull request and generated 3 comments.

| File | Description |
| --- | --- |
| delphi/polismath/pca_kmeans_rep/repness.py | Change pseudocount constant to 2.0 and update inline rationale. |
| delphi/tests/test_discrepancy_fixes.py | Remove xfails for D4 tests that now pass with updated pseudocount. |
| delphi/tests/test_repness_unit.py | Use PSEUDO_COUNT constant in expected-value calculation instead of hardcoding. |
| delphi/tests/test_old_format_repness.py | Same: import and use PSEUDO_COUNT for expected pa/pd. |
| delphi/tests/simplified_repness_test.py | Update the simplified script’s hardcoded pseudocount to 2.0. |
| delphi/docs/PLAN_DISCREPANCY_FIXES.md | Mark D4 pseudocount fix as done. |
| delphi/docs/HANDOFF_REGRESSION_TEST_PERF.md | Add handoff doc summarizing regression test perf investigation. |
| delphi/docs/CLJ-PARITY-FIXES-JOURNAL.md | Journal entry documenting the D4 fix and perf side investigation. |


Comment on lines +26 to +30
# - With PSEUDO_COUNT = 2.0, we add 1 "virtual" agree and 1 "virtual" disagree
# to each comment's vote count — equivalent to a Beta(2,2) prior
# - This pulls probabilities toward 0.5, with the effect diminishing as n grows
# - Formula: p_agree = (n_agree + PSEUDO_COUNT/2) / (n_votes + PSEUDO_COUNT)
# i.e. (n_agree + 1) / (n_votes + 2)
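As a side note on the comment above, the "pull toward 0.5" can be made explicit by rewriting the formula as a weighted average. This is a sketch of the algebra only, with $n_a$ for `n_agree`, $n$ for `n_votes` (assumed positive), and $c$ for `PSEUDO_COUNT`; it is not taken from either codebase.

```latex
\[
\hat{p}_{\text{agree}}
  = \frac{n_a + c/2}{n + c}
  = \frac{n}{n + c}\cdot\frac{n_a}{n} \;+\; \frac{c}{n + c}\cdot\frac{1}{2}
\]
```

The weight on 0.5 is $c/(n+c)$, which shrinks as $n$ grows. For $c = 2$ the estimate is $(n_a + 1)/(n + 2)$, which coincides with the posterior mode under a Beta(2,2) prior (and with the posterior mean under a uniform Beta(1,1) prior).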
Comment on lines 18 to 21
  # Constants
  Z_90 = 1.645 # Z-score for 90% confidence
- PSEUDO_COUNT = 1.5 # Pseudocount for Bayesian smoothing
+ PSEUDO_COUNT = 2.0 # Pseudocount for Bayesian smoothing (Beta(2,2) prior, matches Clojure)

Comment thread delphi/tests/test_discrepancy_fixes.py
@jucor changed the title from "[Stack 7/17] Fix D4: pseudocount formula" to "Fix D4: pseudocount formula" on May 19, 2026
@jucor force-pushed the spr/edge/c0a682ec branch from 7f20a34 to 9fbda43 on May 19, 2026 22:09
@jucor force-pushed the spr/edge/6ae3ee43 branch from b9dcc89 to 189d192 on May 19, 2026 22:09
@github-actions

Delphi Coverage Report

| File | Stmts | Miss | Cover |
| --- | ---: | ---: | ---: |
| __init__.py | 2 | 0 | 100% |
| benchmarks/bench_pca.py | 76 | 76 | 0% |
| benchmarks/bench_repness.py | 81 | 81 | 0% |
| benchmarks/bench_update_votes.py | 38 | 38 | 0% |
| benchmarks/benchmark_utils.py | 34 | 34 | 0% |
| components/__init__.py | 1 | 0 | 100% |
| components/config.py | 165 | 133 | 19% |
| conversation/__init__.py | 2 | 0 | 100% |
| conversation/conversation.py | 1117 | 328 | 71% |
| conversation/manager.py | 131 | 42 | 68% |
| database/__init__.py | 1 | 0 | 100% |
| database/dynamodb.py | 387 | 234 | 40% |
| database/postgres.py | 305 | 205 | 33% |
| pca_kmeans_rep/__init__.py | 5 | 0 | 100% |
| pca_kmeans_rep/clusters.py | 257 | 22 | 91% |
| pca_kmeans_rep/corr.py | 98 | 17 | 83% |
| pca_kmeans_rep/pca.py | 52 | 16 | 69% |
| pca_kmeans_rep/repness.py | 361 | 51 | 86% |
| pca_kmeans_rep/stats.py | 107 | 22 | 79% |
| regression/__init__.py | 4 | 0 | 100% |
| regression/clojure_comparer.py | 188 | 17 | 91% |
| regression/comparer.py | 887 | 720 | 19% |
| regression/datasets.py | 135 | 27 | 80% |
| regression/recorder.py | 36 | 27 | 25% |
| regression/utils.py | 137 | 118 | 14% |
| run_math_pipeline.py | 260 | 114 | 56% |
| umap_narrative/500_generate_embedding_umap_cluster.py | 210 | 109 | 48% |
| umap_narrative/501_calculate_comment_extremity.py | 112 | 54 | 52% |
| umap_narrative/502_calculate_priorities.py | 135 | 135 | 0% |
| umap_narrative/700_datamapplot_for_layer.py | 502 | 502 | 0% |
| umap_narrative/701_static_datamapplot_for_layer.py | 310 | 310 | 0% |
| umap_narrative/702_consensus_divisive_datamapplot.py | 432 | 432 | 0% |
| umap_narrative/801_narrative_report_batch.py | 785 | 785 | 0% |
| umap_narrative/802_process_batch_results.py | 265 | 265 | 0% |
| umap_narrative/803_check_batch_status.py | 175 | 175 | 0% |
| umap_narrative/llm_factory_constructor/__init__.py | 2 | 2 | 0% |
| umap_narrative/llm_factory_constructor/model_provider.py | 157 | 157 | 0% |
| umap_narrative/polismath_commentgraph/__init__.py | 1 | 0 | 100% |
| umap_narrative/polismath_commentgraph/cli.py | 270 | 270 | 0% |
| umap_narrative/polismath_commentgraph/core/__init__.py | 3 | 3 | 0% |
| umap_narrative/polismath_commentgraph/core/clustering.py | 108 | 108 | 0% |
| umap_narrative/polismath_commentgraph/core/embedding.py | 104 | 104 | 0% |
| umap_narrative/polismath_commentgraph/lambda_handler.py | 219 | 219 | 0% |
| umap_narrative/polismath_commentgraph/schemas/__init__.py | 2 | 0 | 100% |
| umap_narrative/polismath_commentgraph/schemas/dynamo_models.py | 160 | 9 | 94% |
| umap_narrative/polismath_commentgraph/tests/conftest.py | 17 | 17 | 0% |
| umap_narrative/polismath_commentgraph/tests/test_clustering.py | 74 | 74 | 0% |
| umap_narrative/polismath_commentgraph/tests/test_embedding.py | 55 | 55 | 0% |
| umap_narrative/polismath_commentgraph/tests/test_storage.py | 87 | 87 | 0% |
| umap_narrative/polismath_commentgraph/utils/__init__.py | 3 | 0 | 100% |
| umap_narrative/polismath_commentgraph/utils/converter.py | 283 | 237 | 16% |
| umap_narrative/polismath_commentgraph/utils/group_data.py | 354 | 336 | 5% |
| umap_narrative/polismath_commentgraph/utils/storage.py | 584 | 477 | 18% |
| umap_narrative/reset_conversation.py | 159 | 50 | 69% |
| umap_narrative/run_pipeline.py | 453 | 312 | 31% |
| utils/general.py | 62 | 41 | 34% |
| Total | 10950 | 7647 | 30% |
