Fix D15: match Clojure moderation handling (zero out columns, don't remove)#2523

Open
jucor wants to merge 1 commit into spr/edge/9ec28252 from spr/edge/c3450b9a

@jucor
Collaborator

@jucor jucor commented Mar 30, 2026

Summary

Python's _apply_moderation() previously removed moderated-out comment columns
entirely from rating_mat, whereas Clojure's zero-out-columns
(named_matrix.clj:214-230) sets all values in moderated columns to 0,
preserving matrix structure.

This fix changes Python to match:

  • Moderated-out comment columns are zeroed (values set to 0.0), not removed
  • rating_mat retains the same column count as raw_rating_mat
  • Moderated-out participants (rows) are still removed — unchanged
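
The behaviour described above can be sketched in a few lines. This is a minimal illustration on plain Python lists, not the code in conversation.py: the function name `apply_moderation_sketch`, its parameters, and the dict-of-rows layout are all hypothetical. It also assumes that unvoted entries (here `None`) in a moderated column become 0.0 along with everything else, following "all values in moderated columns".

```python
from typing import Dict, Hashable, Iterable, List, Optional

Row = List[Optional[float]]


def apply_moderation_sketch(
    matrix: Dict[Hashable, Row],
    tids: List[Hashable],
    mod_out_tids: Iterable[Hashable],
    mod_out_pids: Iterable[Hashable],
) -> Dict[Hashable, Row]:
    """Hypothetical sketch: drop moderated rows, zero moderated columns."""
    mod_out_tids = set(mod_out_tids)
    mod_out_pids = set(mod_out_pids)
    # Positions of the moderated-out comments within each row.
    zero_cols = {i for i, tid in enumerate(tids) if tid in mod_out_tids}

    result: Dict[Hashable, Row] = {}
    for pid, row in matrix.items():
        if pid in mod_out_pids:
            continue  # moderated-out participants are still removed
        # Moderated columns are zeroed in place; the row keeps its length.
        result[pid] = [0.0 if i in zero_cols else v for i, v in enumerate(row)]
    return result


raw = {
    "p1": [1.0, -1.0, None],
    "p2": [-1.0, 1.0, 1.0],
    "p3": [1.0, None, -1.0],
}
tids = [10, 11, 12]  # hypothetical comment ids, one per column
rating = apply_moderation_sketch(raw, tids, mod_out_tids=[11], mod_out_pids=["p3"])
```

In this sketch `rating` keeps three columns per row, the middle column is all zeros, and participant `p3` is gone.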

Why zeroing matters

  • Matrix dimensions: Clojure's rating-mat has the same shape as raw-rating-mat.
    Downstream code (PCA, repness) processes the same-shaped matrix.
  • tids list: Column indices stay stable. Consumers depend on this.
  • Practical impact: Zeroed columns carry no signal (na=0, nd=0, i.e. no agree or
    disagree votes are counted), so they fail significance tests and are excluded
    from repness/consensus. PCA sees zero variance in those columns.
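
A small sketch may make the points above concrete. It uses made-up data and plain lists rather than the project's matrix type, and it assumes 1 means agree and -1 means disagree purely for illustration; the argument does not depend on that sign convention.

```python
from statistics import pvariance

tids = [10, 11, 12]  # hypothetical comment ids, one per column
raw = [
    [1.0, -1.0, 1.0],
    [-1.0, 1.0, 0.0],
    [1.0, 1.0, -1.0],
]
moderated = {11}

# Removing the column shifts the position of every later comment.
kept_tids = [t for t in tids if t not in moderated]

# Zeroing in place leaves tids and column positions untouched.
zeroed = [
    [0.0 if tids[j] in moderated else v for j, v in enumerate(row)]
    for row in raw
]

# The zeroed column has no agree/disagree counts and no variance.
col = [row[tids.index(11)] for row in zeroed]
n_agree = sum(1 for v in col if v == 1)
n_disagree = sum(1 for v in col if v == -1)
col_variance = pvariance(col)
```

With removal, comment 12 moves from index 2 to index 1 in `kept_tids`; with zeroing it stays at index 2, and the moderated column contributes zero counts and zero variance.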

Changes

  • conversation.py: _apply_moderation() — zero out columns instead of removing
  • test_discrepancy_fixes.py: 5 new synthetic tests + 2 enhanced real-data tests
  • test_conversation.py: Updated to expect zeroed columns

Test plan

  • Synthetic tests: zeroing preserves columns, values are 0, non-moderated unchanged
  • Real-data test: biodiversity-incremental (169 mod-out comments)
  • Full public test suite: 328 passed, 0 failed
  • TDD cycle: RED (2 failures) → GREEN (all pass)
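
For readers who want a feel for the three synthetic properties listed above, here is a rough `unittest` sketch. It is not taken from test_discrepancy_fixes.py: `zero_columns` and `ZeroingSketchTest` are hypothetical names, and the helper only zeroes columns on a list of lists.

```python
import unittest


def zero_columns(rows, col_indices):
    """Hypothetical helper: copy rows with the given columns set to 0.0."""
    cols = set(col_indices)
    return [[0.0 if j in cols else v for j, v in enumerate(row)] for row in rows]


class ZeroingSketchTest(unittest.TestCase):
    def setUp(self):
        self.raw = [[1.0, -1.0, 1.0], [-1.0, 1.0, 0.0]]
        self.out = zero_columns(self.raw, [1])  # column 1 is moderated out

    def test_column_count_preserved(self):
        self.assertEqual([len(r) for r in self.out], [len(r) for r in self.raw])

    def test_moderated_values_are_zero(self):
        self.assertTrue(all(row[1] == 0.0 for row in self.out))

    def test_other_columns_unchanged(self):
        for before, after in zip(self.raw, self.out):
            self.assertEqual((before[0], before[2]), (after[0], after[2]))
```

Running the class with `python -m unittest` exercises each property separately, so a regression points at the specific invariant that broke.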

🤖 Generated with Claude Code

Squashed commits

  • Fix D15: match Clojure moderation handling (zero out columns, don't remove)

commit-id:c3450b9a


Stack:


⚠️ Part of a stack created by spr. Do not merge manually using the UI - doing so may have unexpected results.

Contributor

Copilot AI left a comment

Pull request overview

This PR fixes discrepancy D15 between Python and Clojure: Python's _apply_moderation() previously removed moderated-out comment columns from rating_mat entirely, while Clojure's zero-out-columns zeros them in place. Changing to zeroing preserves matrix structure (same shape as raw_rating_mat), which downstream consumers (tids, PCA, repness) expect.

Changes:

  • _apply_moderation() now removes only moderated-out participant rows and zeros out moderated-out comment columns instead of dropping them.
  • Adds a new TestD15SyntheticModeration class (5 small synthetic tests) plus enhanced real-data D15 tests that apply mod-out from the Clojure blob and verify column counts and zero values.
  • Updates pre-existing tests in test_conversation.py and test_discrepancy_fixes.py to expect zeroed (not removed) moderated columns. Plan/journal docs updated.

Reviewed changes

Copilot reviewed 5 out of 5 changed files in this pull request and generated no comments.

Show a summary per file
| File | Description |
| --- | --- |
| `delphi/polismath/conversation/conversation.py` | Core fix: zero moderated comment columns in _apply_moderation() instead of dropping them. |
| `delphi/tests/test_discrepancy_fixes.py` | Expands D15 real-data tests and adds 5 synthetic tests; updates D2c assertion to expect equal raw/filtered column counts. |
| `delphi/tests/test_conversation.py` | Updates moderation tests to assert moderated columns remain present and are zeroed. |
| `delphi/docs/PLAN_DISCREPANCY_FIXES.md` | Marks D8 and D15 as done; adds K-divergence investigation section. |
| `delphi/docs/CLJ-PARITY-FIXES-JOURNAL.md` | Adds session notes documenting the D15 fix, rationale, tests, and impact. |

@jucor jucor changed the title from "[Stack 16/17] Fix D15: match Clojure moderation handling (zero out columns, don't remove)" to "Fix D15: match Clojure moderation handling (zero out columns, don't remove)" on May 19, 2026
@jucor jucor force-pushed the spr/edge/9ec28252 branch from 8c7b7f7 to c381171 on May 19, 2026 22:09
@jucor jucor force-pushed the spr/edge/c3450b9a branch from 6be5051 to 954dc4a on May 19, 2026 22:09
@github-actions

Delphi Coverage Report

| File | Stmts | Miss | Cover |
| --- | ---: | ---: | ---: |
| `__init__.py` | 2 | 0 | 100% |
| `benchmarks/bench_pca.py` | 76 | 76 | 0% |
| `benchmarks/bench_repness.py` | 81 | 81 | 0% |
| `benchmarks/bench_update_votes.py` | 38 | 38 | 0% |
| `benchmarks/benchmark_utils.py` | 34 | 34 | 0% |
| `components/__init__.py` | 1 | 0 | 100% |
| `components/config.py` | 165 | 133 | 19% |
| `conversation/__init__.py` | 2 | 0 | 100% |
| `conversation/conversation.py` | 1109 | 320 | 71% |
| `conversation/manager.py` | 131 | 42 | 68% |
| `database/__init__.py` | 1 | 0 | 100% |
| `database/dynamodb.py` | 387 | 234 | 40% |
| `database/postgres.py` | 305 | 205 | 33% |
| `pca_kmeans_rep/__init__.py` | 5 | 0 | 100% |
| `pca_kmeans_rep/clusters.py` | 257 | 22 | 91% |
| `pca_kmeans_rep/corr.py` | 98 | 17 | 83% |
| `pca_kmeans_rep/pca.py` | 52 | 16 | 69% |
| `pca_kmeans_rep/repness.py` | 312 | 34 | 89% |
| `regression/__init__.py` | 4 | 0 | 100% |
| `regression/clojure_comparer.py` | 188 | 17 | 91% |
| `regression/comparer.py` | 887 | 720 | 19% |
| `regression/datasets.py` | 135 | 27 | 80% |
| `regression/recorder.py` | 36 | 27 | 25% |
| `regression/utils.py` | 138 | 94 | 32% |
| `run_math_pipeline.py` | 260 | 114 | 56% |
| `umap_narrative/500_generate_embedding_umap_cluster.py` | 210 | 109 | 48% |
| `umap_narrative/501_calculate_comment_extremity.py` | 112 | 53 | 53% |
| `umap_narrative/502_calculate_priorities.py` | 135 | 135 | 0% |
| `umap_narrative/700_datamapplot_for_layer.py` | 502 | 502 | 0% |
| `umap_narrative/701_static_datamapplot_for_layer.py` | 310 | 310 | 0% |
| `umap_narrative/702_consensus_divisive_datamapplot.py` | 432 | 432 | 0% |
| `umap_narrative/801_narrative_report_batch.py` | 785 | 785 | 0% |
| `umap_narrative/802_process_batch_results.py` | 265 | 265 | 0% |
| `umap_narrative/803_check_batch_status.py` | 175 | 175 | 0% |
| `umap_narrative/llm_factory_constructor/__init__.py` | 2 | 2 | 0% |
| `umap_narrative/llm_factory_constructor/model_provider.py` | 157 | 157 | 0% |
| `umap_narrative/polismath_commentgraph/__init__.py` | 1 | 0 | 100% |
| `umap_narrative/polismath_commentgraph/cli.py` | 270 | 270 | 0% |
| `umap_narrative/polismath_commentgraph/core/__init__.py` | 3 | 3 | 0% |
| `umap_narrative/polismath_commentgraph/core/clustering.py` | 108 | 108 | 0% |
| `umap_narrative/polismath_commentgraph/core/embedding.py` | 104 | 104 | 0% |
| `umap_narrative/polismath_commentgraph/lambda_handler.py` | 219 | 219 | 0% |
| `umap_narrative/polismath_commentgraph/schemas/__init__.py` | 2 | 0 | 100% |
| `umap_narrative/polismath_commentgraph/schemas/dynamo_models.py` | 160 | 9 | 94% |
| `umap_narrative/polismath_commentgraph/tests/conftest.py` | 17 | 17 | 0% |
| `umap_narrative/polismath_commentgraph/tests/test_clustering.py` | 74 | 74 | 0% |
| `umap_narrative/polismath_commentgraph/tests/test_embedding.py` | 55 | 55 | 0% |
| `umap_narrative/polismath_commentgraph/tests/test_storage.py` | 87 | 87 | 0% |
| `umap_narrative/polismath_commentgraph/utils/__init__.py` | 3 | 0 | 100% |
| `umap_narrative/polismath_commentgraph/utils/converter.py` | 283 | 237 | 16% |
| `umap_narrative/polismath_commentgraph/utils/group_data.py` | 354 | 336 | 5% |
| `umap_narrative/polismath_commentgraph/utils/storage.py` | 584 | 518 | 11% |
| `umap_narrative/reset_conversation.py` | 159 | 50 | 69% |
| `umap_narrative/run_pipeline.py` | 453 | 312 | 31% |
| `utils/general.py` | 62 | 41 | 34% |
| Total | 10787 | 7616 | 29% |


3 participants