[Stack 19/27] Fix D8: match Clojure repful classification (rat > rdt) by jucor · Pull Request #2451 · compdemocracy/polis

jucor · 2026-03-16T18:25:51Z

Summary

Stacked on #2450 (Fix D7: match Clojure repness metric formula (product of 4 signed values)). Please review and merge #2450 first.
Next in stack: #2452 (Fix D15: match Clojure moderation handling (zero out columns, don't remove))

Simplifies the repful ("representative for agree or disagree?") classification
to match Clojure's finalize-cmt-stats (repness.clj:175-177).

Before (Python): 3-branch conditional:

pa > 0.5 AND ra > 1.0 → agree
pd > 0.5 AND rd > 1.0 → disagree
Fallback: whichever metric is higher

After (Clojure): rat > rdt → agree, else disagree.

The old thresholds were redundant — rat and rdt (two-proportion z-scores)
already encode whether the group's agree/disagree rate is significantly higher
than other groups. The simple comparison is both correct and clearer.
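
As a rough illustration of the change described above, the two rules can be sketched side by side. This is not the code from repness.py: the function names are hypothetical, the fallback inputs (`agree_metric`, `disagree_metric`) and its tie-breaking are assumptions about what "whichever metric is higher" means, and the numbers in the example are made up.

```python
def repful_old(pa, ra, pd, rd, agree_metric, disagree_metric):
    """Old 3-branch rule as summarized in this PR (fallback details assumed)."""
    if pa > 0.5 and ra > 1.0:
        return "agree"
    if pd > 0.5 and rd > 1.0:
        return "disagree"
    # Fallback: whichever metric is higher. The metric names and the
    # tie-breaking direction are assumptions, not taken from the source.
    return "agree" if agree_metric >= disagree_metric else "disagree"


def repful_new(rat, rdt):
    """New rule: strict comparison, so ties fall through to 'disagree'."""
    return "agree" if rat > rdt else "disagree"


# Edge cases named in the test plan.
edge_cases = {
    "rat > rdt": repful_new(2.0, 1.0),
    "rat < rdt": repful_new(1.0, 2.0),
    "equal": repful_new(1.0, 1.0),
    "both negative": repful_new(-1.0, -2.0),
    "both zero": repful_new(0.0, 0.0),
}

# A made-up divergence case: the old thresholds pick 'agree',
# while the z-score comparison picks 'disagree'.
old_label = repful_old(pa=0.6, ra=1.2, pd=0.3, rd=0.9,
                       agree_metric=0.1, disagree_metric=0.2)
new_label = repful_new(rat=0.5, rdt=0.8)
```

Because the comparison is strict, equal `rat` and `rdt` (including both zero) classify as disagree under the new rule.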

Changes

repness.py: finalize_cmt_stats() — 3-branch logic → rat > rdt
repness.py: Vectorized — np.select with conditions → np.where(rat > rdt)
test_discrepancy_fixes.py: Expanded from 2 to 6 tests (including edge cases:
equal rat/rdt, both negative, both zero)
Golden snapshots re-recorded (repful direction changes for some comments)
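
The vectorized change listed above might look roughly like the following. It is a minimal sketch assuming `rat` and `rdt` are available as numeric arrays; the helper name `classify_repful` is hypothetical and the real code in compute_group_comment_stats_df may be organized differently.

```python
import numpy as np


def classify_repful(rat, rdt):
    """Element-wise 'agree' where rat > rdt, otherwise 'disagree'."""
    rat = np.asarray(rat, dtype=float)
    rdt = np.asarray(rdt, dtype=float)
    # A single np.where replaces the earlier multi-condition np.select.
    return np.where(rat > rdt, "agree", "disagree")


labels = classify_repful(
    [2.0, 1.0, 1.0, -1.0, 0.0],
    [1.0, 2.0, 1.0, -2.0, 0.0],
)
```

One property worth keeping in mind with this form: any comparison involving NaN evaluates to False in NumPy, so rows with a missing z-score would land in the disagree branch unless they are filtered beforehand.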

Test plan

6 targeted D8 tests pass (rat>rdt, rat<rdt, equal, both negative, both zero, old-vs-new divergence case)
Full test suite passes (excluding DynamoDB/MinIO tests)
Private dataset tests pass (--include-local)
Golden snapshots re-recorded for all 7 datasets
19/19 regression tests pass

🤖 Generated with Claude Code

Copilot

Pull request overview

This PR simplifies the repful ("representative for agree or disagree") classification in finalize_cmt_stats to match Clojure's repness.clj:175-177 logic. The old 3-branch conditional (pa > 0.5 AND ra > 1.0 → agree, pd > 0.5 AND rd > 1.0 → disagree, fallback to higher metric) is replaced with the simpler rat > rdt → agree, else disagree.

Changes:

Replaced both scalar (finalize_cmt_stats) and vectorized (compute_group_comment_stats_df) repful classification with rat > rdt comparison
Expanded D8 tests from 2 to 6 formula tests (including edge cases: equal, both negative, both zero), removed xfail markers for now-passing tests
Re-recorded golden snapshots for affected datasets to reflect repful direction changes

Reviewed changes

Copilot reviewed 6 out of 6 changed files in this pull request and generated 1 comment.

File	Description
delphi/polismath/pca_kmeans_rep/repness.py	Simplified repful classification in both scalar and vectorized paths to `rat > rdt`
delphi/tests/test_discrepancy_fixes.py	Added 4 new edge-case tests, removed xfail from D8 formula tests, updated xfail reason on blob test
delphi/docs/CLJ-PARITY-FIXES-JOURNAL.md	Added PR 7 / Session 10 journal entry documenting the D8 fix
delphi/docs/PLAN_DISCREPANCY_FIXES.md	Marked D8 as DONE
delphi/real_data/r6vbnhffkxbd7ifmfbdrd-vw/golden_snapshot.json	Re-recorded snapshot with updated repness values and repful directions
delphi/real_data/r4tykwac8thvzv35jrn53-biodiversity/golden_snapshot.json	Re-recorded snapshot with updated repness values


    """
    D8: Python uses if pa > 0.5 AND ra > 1.0 → 'agree'; elif pd > 0.5 AND rd > 1.0 → 'disagree'
-        Clojure uses simple rat > rdt → 'agree'; else → 'disagree'
+        Clojure uses simple rat > rdt → 'agree'; else → 'disagree' (repness.clj:175-177)


github-actions · 2026-03-18T19:17:42Z

Delphi Coverage Report

File	Stmts	Miss	Cover
init.py	3	0	100%
main.py	55	55	0%
benchmarks/bench_pca.py	76	76	0%
benchmarks/bench_repness.py	81	81	0%
benchmarks/bench_update_votes.py	38	38	0%
benchmarks/benchmark_utils.py	34	34	0%
components/init.py	2	0	100%
components/config.py	165	133	19%
components/server.py	116	72	38%
conversation/init.py	2	0	100%
conversation/conversation.py	1108	320	71%
conversation/manager.py	131	42	68%
database/init.py	1	0	100%
database/dynamodb.py	387	234	40%
database/postgres.py	306	205	33%
pca_kmeans_rep/init.py	5	0	100%
pca_kmeans_rep/clusters.py	265	22	92%
pca_kmeans_rep/corr.py	98	17	83%
pca_kmeans_rep/pca.py	50	15	70%
pca_kmeans_rep/repness.py	305	35	89%
poller.py	224	188	16%
regression/init.py	5	0	100%
regression/clojure_comparer.py	182	83	54%
regression/comparer.py	887	473	47%
regression/datasets.py	135	27	80%
regression/recorder.py	36	27	25%
regression/utils.py	138	52	62%
run_math_pipeline.py	260	114	56%
system.py	85	55	35%
umap_narrative/500_generate_embedding_umap_cluster.py	210	109	48%
umap_narrative/501_calculate_comment_extremity.py	112	54	52%
umap_narrative/502_calculate_priorities.py	135	135	0%
umap_narrative/700_datamapplot_for_layer.py	502	502	0%
umap_narrative/701_static_datamapplot_for_layer.py	310	310	0%
umap_narrative/702_consensus_divisive_datamapplot.py	432	432	0%
umap_narrative/801_narrative_report_batch.py	787	787	0%
umap_narrative/802_process_batch_results.py	265	265	0%
umap_narrative/803_check_batch_status.py	175	175	0%
umap_narrative/llm_factory_constructor/init.py	2	2	0%
umap_narrative/llm_factory_constructor/model_provider.py	157	157	0%
umap_narrative/polismath_commentgraph/init.py	1	0	100%
umap_narrative/polismath_commentgraph/cli.py	270	270	0%
umap_narrative/polismath_commentgraph/core/init.py	3	3	0%
umap_narrative/polismath_commentgraph/core/clustering.py	110	110	0%
umap_narrative/polismath_commentgraph/core/embedding.py	104	104	0%
umap_narrative/polismath_commentgraph/lambda_handler.py	219	219	0%
umap_narrative/polismath_commentgraph/schemas/init.py	2	0	100%
umap_narrative/polismath_commentgraph/schemas/dynamo_models.py	160	9	94%
umap_narrative/polismath_commentgraph/tests/conftest.py	17	17	0%
umap_narrative/polismath_commentgraph/tests/test_clustering.py	74	74	0%
umap_narrative/polismath_commentgraph/tests/test_embedding.py	55	55	0%
umap_narrative/polismath_commentgraph/tests/test_storage.py	87	87	0%
umap_narrative/polismath_commentgraph/utils/init.py	3	0	100%
umap_narrative/polismath_commentgraph/utils/converter.py	283	237	16%
umap_narrative/polismath_commentgraph/utils/group_data.py	354	336	5%
umap_narrative/polismath_commentgraph/utils/storage.py	585	477	18%
umap_narrative/reset_conversation.py	159	50	69%
umap_narrative/run_pipeline.py	453	312	31%
utils/general.py	63	41	35%
Total	11269	7727	31%

Copilot

Pull request overview

Copilot reviewed 3 out of 3 changed files in this pull request and generated 3 comments.


Copilot · 2026-03-30T16:30:19Z

    "n_participants_in_csv": 69,
    "fixed_timestamp": 1700000000000,
-    "recorded_at": "2026-03-27T01:51:24.692321"
+    "recorded_at": "2026-03-27T01:51:39.156540"
  },


This snapshot update appears to only change recorded_at (and also math_tick elsewhere), which are timestamp-based and not used for regression comparisons (the comparer ignores math_tick and doesn’t compare metadata). If there are no substantive stage-output diffs, consider reverting to avoid noisy churn or making the recorder write stable values for these fields.

Copilot · 2026-03-30T16:30:20Z

    "n_participants_in_csv": 536,
    "fixed_timestamp": 1700000000000,
-    "recorded_at": "2026-03-27T01:51:23.001794"
+    "recorded_at": "2026-03-27T01:51:37.771402"
  },


This snapshot change is only updating recorded_at (and math_tick elsewhere), which are timestamp-based and not part of regression comparisons (math_tick is ignored and metadata isn’t compared). If there are no actual stage-output changes for this dataset, please revert these timestamp-only edits to keep diffs meaningful.
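
One way to act on the two comments above is to compare snapshots with the timestamp-based fields removed, and only re-record when something else changed. The sketch below is an assumption-heavy illustration: the helper names and the `metadata` and `stages` keys in the example are hypothetical, and only `recorded_at` and `math_tick` come from the review comments.

```python
# Fields the review comments describe as timestamp-based.
VOLATILE_KEYS = {"recorded_at", "math_tick"}


def strip_volatile(node):
    """Return a copy of a JSON-like structure without volatile keys."""
    if isinstance(node, dict):
        return {k: strip_volatile(v) for k, v in node.items()
                if k not in VOLATILE_KEYS}
    if isinstance(node, list):
        return [strip_volatile(v) for v in node]
    return node


def snapshots_differ(old, new):
    """True only when something other than the volatile fields changed."""
    return strip_volatile(old) != strip_volatile(new)


# Hypothetical snapshot layout, for illustration only.
old = {"metadata": {"n_participants_in_csv": 69,
                    "recorded_at": "2026-03-27T01:51:24.692321"},
       "stages": {"repful": ["agree", "disagree"]}}
timestamp_only = {"metadata": {"n_participants_in_csv": 69,
                               "recorded_at": "2026-03-27T01:51:39.156540"},
                  "stages": {"repful": ["agree", "disagree"]}}
real_change = {"metadata": {"n_participants_in_csv": 69,
                            "recorded_at": "2026-03-27T01:51:39.156540"},
               "stages": {"repful": ["disagree", "disagree"]}}
```

A recorder could call a check like `snapshots_differ` before writing, which would leave the file untouched in the timestamp-only case.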

Copilot · 2026-03-30T16:30:20Z

+- `test_discrepancy_fixes.py`: Expanded `TestD8FinalizeStats` from 2 to 7 tests (5 formula +
+  1 blob xfail + edge cases for equal/negative/zero rat/rdt)


The journal entry says TestD8FinalizeStats expanded “from 2 to 7 tests”, but the PR description states 6 tests. Please reconcile the counts (and ideally list the exact test names) so the journal accurately reflects the change set.

Suggested change

-- `test_discrepancy_fixes.py`: Expanded `TestD8FinalizeStats` from 2 to 7 tests (5 formula +
-  1 blob xfail + edge cases for equal/negative/zero rat/rdt)
+- `test_discrepancy_fixes.py`: Expanded `TestD8FinalizeStats` to 6 tests covering the formula,
+  the blob xfail, and edge cases for equal/negative/zero `rat`/`rdt` values


jucor · 2026-03-30T22:54:46Z

Superseded by spr-managed PR stack. See the new stack starting at #2508.

jucor requested a review from Copilot March 16, 2026 18:25

Copilot started reviewing on behalf of jucor March 16, 2026 18:26

jucor changed the title ~~Fix D8: match Clojure repful classification (rat > rdt)~~ [Stack 17/17] Fix D8: match Clojure repful classification (rat > rdt) Mar 16, 2026

Copilot AI reviewed Mar 16, 2026


jucor marked this pull request as draft March 17, 2026 10:35

jucor force-pushed the jc/clj-parity-d7-repness-metric branch from 1d18f1b to 45d6c60 Compare March 17, 2026 16:10

jucor force-pushed the jc/clj-parity-d8-finalize-stats branch from db0fe14 to 2a625de Compare March 17, 2026 16:10

jucor changed the title ~~[Stack 17/17] Fix D8: match Clojure repful classification (rat > rdt)~~ [Stack 17/24] Fix D8: match Clojure repful classification (rat > rdt) Mar 17, 2026

jucor changed the title ~~[Stack 17/24] Fix D8: match Clojure repful classification (rat > rdt)~~ [Stack 17/25] Fix D8: match Clojure repful classification (rat > rdt) Mar 17, 2026

This was referenced Mar 17, 2026

[Stack 18/27] Fix D7: match Clojure repness metric formula (product of 4 signed values) #2450

Closed

[Stack 20/27] Fix D15: match Clojure moderation handling (zero out columns, don't remove) #2452

Closed

jucor force-pushed the jc/clj-parity-d7-repness-metric branch from 45d6c60 to 5b57f8a Compare March 18, 2026 18:50

jucor force-pushed the jc/clj-parity-d8-finalize-stats branch from 39fc7ef to c0684d4 Compare March 18, 2026 19:02

jucor force-pushed the jc/clj-parity-d7-repness-metric branch from 5b57f8a to 7705349 Compare March 18, 2026 19:06

jucor force-pushed the jc/clj-parity-d8-finalize-stats branch from c0684d4 to 42d9f25 Compare March 18, 2026 19:07

jucor force-pushed the jc/clj-parity-d8-finalize-stats branch from 42d9f25 to b68c7e5 Compare March 19, 2026 10:23

jucor force-pushed the jc/clj-parity-d7-repness-metric branch 2 times, most recently from 1a0f157 to a8428d5 Compare March 19, 2026 10:46

jucor force-pushed the jc/clj-parity-d8-finalize-stats branch from b68c7e5 to c2e521d Compare March 19, 2026 10:46

jucor changed the title ~~[Stack 17/25] Fix D8: match Clojure repful classification (rat > rdt)~~ [Stack 16/24] Fix D8: match Clojure repful classification (rat > rdt) Mar 19, 2026

jucor force-pushed the jc/clj-parity-d7-repness-metric branch from a8428d5 to d9ed377 Compare March 19, 2026 12:32

jucor force-pushed the jc/clj-parity-d8-finalize-stats branch from c2e521d to b8a8e08 Compare March 19, 2026 12:32

jucor force-pushed the jc/clj-parity-d7-repness-metric branch from d9ed377 to 9f20b50 Compare March 19, 2026 14:52

jucor force-pushed the jc/clj-parity-d8-finalize-stats branch from b8a8e08 to 0154ce7 Compare March 19, 2026 14:52

jucor changed the title ~~[Stack 16/24] Fix D8: match Clojure repful classification (rat > rdt)~~ [Stack 17/25] Fix D8: match Clojure repful classification (rat > rdt) Mar 19, 2026

jucor force-pushed the jc/clj-parity-d7-repness-metric branch from 9f20b50 to 65f136d Compare March 23, 2026 15:33

jucor force-pushed the jc/clj-parity-d8-finalize-stats branch from 0154ce7 to a313f1c Compare March 23, 2026 15:33

jucor force-pushed the jc/clj-parity-d7-repness-metric branch from 65f136d to a0d8710 Compare March 23, 2026 15:41

jucor force-pushed the jc/clj-parity-d8-finalize-stats branch from a313f1c to 6a2ac55 Compare March 23, 2026 15:41

jucor force-pushed the jc/clj-parity-d8-finalize-stats branch from 0297ca2 to c0f1f0f Compare March 24, 2026 10:28

jucor force-pushed the jc/clj-parity-d7-repness-metric branch from 7c92111 to f2c2965 Compare March 24, 2026 11:13

jucor force-pushed the jc/clj-parity-d8-finalize-stats branch 2 times, most recently from 4ebd5ab to baceacd Compare March 24, 2026 11:45

jucor force-pushed the jc/clj-parity-d7-repness-metric branch from f2c2965 to e1392d1 Compare March 26, 2026 21:24

jucor force-pushed the jc/clj-parity-d8-finalize-stats branch 2 times, most recently from 74b31de to c4f5811 Compare March 27, 2026 01:15

jucor force-pushed the jc/clj-parity-d7-repness-metric branch 2 times, most recently from 799a9c4 to 9a1b3b3 Compare March 27, 2026 01:53

jucor force-pushed the jc/clj-parity-d8-finalize-stats branch from c4f5811 to 2960412 Compare March 27, 2026 01:53

jucor force-pushed the jc/clj-parity-d7-repness-metric branch from 9a1b3b3 to cee0f53 Compare March 27, 2026 02:10

jucor force-pushed the jc/clj-parity-d8-finalize-stats branch 2 times, most recently from 44b04ae to b24d69b Compare March 27, 2026 10:41

jucor changed the title ~~[Stack 17/25] Fix D8: match Clojure repful classification (rat > rdt)~~ [Stack 18/26] Fix D8: match Clojure repful classification (rat > rdt) Mar 30, 2026

jucor force-pushed the jc/clj-parity-d7-repness-metric branch from 3511e00 to e209f37 Compare March 30, 2026 12:48

jucor force-pushed the jc/clj-parity-d8-finalize-stats branch from b24d69b to abfbacb Compare March 30, 2026 12:48

jucor changed the title ~~[Stack 18/26] Fix D8: match Clojure repful classification (rat > rdt)~~ [Stack 19/27] Fix D8: match Clojure repful classification (rat > rdt) Mar 30, 2026

jucor force-pushed the jc/clj-parity-d7-repness-metric branch from e209f37 to 4df6e36 Compare March 30, 2026 12:54

jucor force-pushed the jc/clj-parity-d8-finalize-stats branch from abfbacb to 3847f76 Compare March 30, 2026 12:54

jucor requested a review from Copilot March 30, 2026 16:25

Copilot started reviewing on behalf of jucor March 30, 2026 16:26

Copilot AI reviewed Mar 30, 2026


jucor force-pushed the jc/clj-parity-d8-finalize-stats branch from 3847f76 to f7062f8 Compare March 30, 2026 16:49

jucor force-pushed the jc/clj-parity-d7-repness-metric branch from 4df6e36 to 9bb9604 Compare March 30, 2026 16:49

Journal: add review session entry (2026-03-17)

f7329b4

Documents D5-D8 review findings, blob injection tests, CI fixes, k-divergence discovery, stack reordering, and next steps. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

jucor force-pushed the jc/clj-parity-d8-finalize-stats branch from f7062f8 to f7329b4 Compare March 30, 2026 17:05

This was referenced Mar 30, 2026

IGNORE -- crash from spr #2503

Closed

IGNORE -- crash from spr #2505

Closed

jucor closed this Mar 30, 2026

jucor wants to merge 1 commit into jc/clj-parity-d7-repness-metric from jc/clj-parity-d8-finalize-stats
