
Add split paired output, pairing map model filtering, filter summary report, and fix max-offset filter for asymmetrical model pairs in tirmite search #69

Merged: Adamtaranto merged 5 commits into main from copilot/add-tirmite-search-asymmetrical-handling on Apr 20, 2026

Contributor

Copilot AI commented Apr 20, 2026

tirmite search could not write left and right model hits to separate files (needed by tirmite pair for asymmetrical pairs), did not restrict output to models listed in the pairing map, and provided no consolidated filtering report. In addition, its --max-offset filter used incorrect logic for same-strand symmetric orientations (F,F / R,R).

Split paired output

  • New --split-paired-output flag writes <prefix>_left_hits.tab and <prefix>_right_hits.tab based on the pairing map columns
  • Validates that no model name appears in both the left and right columns, and raises an error if one does
  • Requires --pairing-map; warns about hits from models absent from the map
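
The split described above can be sketched roughly as follows. This is an illustrative sketch only, not the code merged in this PR: the function names `check_disjoint_sides` and `split_hits_by_side` are hypothetical, and it assumes the pairing map is available as a list of (left, right) model-name tuples and each hit as a dict with a "model" key.

```python
def check_disjoint_sides(pairs):
    """Return (left, right) model-name sets; raise if a name is on both sides.

    `pairs` is assumed to be a list of (left_model, right_model) tuples.
    """
    left = {left_model for left_model, _ in pairs}
    right = {right_model for _, right_model in pairs}
    both = left & right
    if both:
        raise ValueError(
            f"Model(s) listed in both left and right columns: {sorted(both)}"
        )
    return left, right


def split_hits_by_side(hits, pairs):
    """Partition hits into left and right lists based on the pairing map.

    Also returns the sorted names of models that have hits but are absent
    from the pairing map, so the caller can warn about them.
    """
    left, right = check_disjoint_sides(pairs)
    left_hits = [h for h in hits if h["model"] in left]
    right_hits = [h for h in hits if h["model"] in right]
    unmapped = sorted({h["model"] for h in hits} - left - right)
    return left_hits, right_hits, unmapped
```

Checking for overlap before splitting means an ambiguous pairing map fails early instead of silently writing the same hit to both files.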

Pairing map model filtering

When --pairing-map is provided, hits from models not listed in the pairing map are now excluded from all output. This is applied as Step 0 in pairing map processing, before nested hit removal and cross-model overlap filtering. Removed models are reported in the log.
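
A minimal sketch of what a Step 0 filter of this kind might look like is shown below. The name `keep_pairing_map_models` and the data shapes (hits as dicts with a "model" key, the pairing map as (left, right) tuples) are assumptions for illustration; the actual `filter_hits_to_pairing_map_models()` signature is not shown in this PR description.

```python
from collections import Counter


def keep_pairing_map_models(hits, pairs):
    """Keep only hits whose model appears in the pairing map (either column).

    Returns the kept hits and a per-model count of excluded hits, which a
    caller could log or feed into a summary report.
    """
    allowed = {model for pair in pairs for model in pair}
    kept = [h for h in hits if h["model"] in allowed]
    excluded = Counter(h["model"] for h in hits if h["model"] not in allowed)
    return kept, excluded
```

Running this before the nested-hit and cross-model steps keeps unrelated models from influencing which hits win at a shared locus.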

Filter summary report

At the end of all pairing map filtering steps, a structured summary report is emitted covering:

  • Step 0 — per-model hit counts excluded because the model is not in the pairing map
  • Step 1 — total nested hits removed per model, listing the container model(s) and per-container counts
  • Step 2 — cross-model overlap hits removed per model pair (removed → winner) with counts

Implemented via a new SearchFilterSummary dataclass (accumulated across all three filter functions) and a log_filter_summary() function called at the end of _process_hits.
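
One way such an accumulator could be structured is sketched below. The class `FilterSummarySketch`, its field names, and `format_summary` are hypothetical stand-ins; the real `SearchFilterSummary` fields and the exact text produced by `log_filter_summary()` may differ.

```python
from collections import Counter, defaultdict
from dataclasses import dataclass, field


@dataclass
class FilterSummarySketch:
    # Step 0: model name -> hits excluded (model not in pairing map)
    excluded: Counter = field(default_factory=Counter)
    # Step 1: removed model -> container model -> nested hits removed
    nested: dict = field(default_factory=lambda: defaultdict(Counter))
    # Step 2: (removed model, winner model) -> cross-model hits removed
    cross: Counter = field(default_factory=Counter)


def format_summary(summary):
    """Build the report as a list of lines (a caller would pass these to a logger)."""
    lines = [f"Step 0: excluded {sum(summary.excluded.values())} hit(s)"]
    for model, n in sorted(summary.excluded.items()):
        lines.append(f"  {model}: {n}")
    total_nested = sum(sum(c.values()) for c in summary.nested.values())
    lines.append(f"Step 1: removed {total_nested} nested hit(s)")
    for model, containers in sorted(summary.nested.items()):
        detail = ", ".join(f"{c} ({n})" for c, n in sorted(containers.items()))
        lines.append(f"  {model}: {sum(containers.values())} nested within [{detail}]")
    lines.append(f"Step 2: removed {sum(summary.cross.values())} cross-model hit(s)")
    for (removed, winner), n in sorted(summary.cross.items()):
        lines.append(f"  {removed} -> {winner}: {n}")
    return lines
```

Passing a single summary object through each filter function lets every step record its own removals, so the report can be produced once at the end.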

Fix --max-offset for same-strand orientations

The filter_hits_by_anchor() function in ensemble_search.py unconditionally kept all hits when the orientation was F,F or R,R and no pairing map was given. It now checks both ends of the query model, matching the hmm_pair.py logic:

# Before: terminus_type = None → kept unconditionally
# After: check both ends (same as hmm_pair.py)
offset_start = hmm_start - 1
offset_end = model_len - hmm_end
if offset_start <= max_offset and offset_end <= max_offset:
    kept.append(True)
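
The both-ends check in the excerpt above can be isolated as a small predicate. This is a sketch, assuming 1-based inclusive HMM coordinates (as implied by `hmm_start - 1`); the helper name `within_offset_both_ends` is hypothetical and does not appear in the PR.

```python
def within_offset_both_ends(hmm_start, hmm_end, model_len, max_offset):
    """Return True if a hit lies within max_offset of BOTH model termini.

    Assumes 1-based, inclusive HMM coordinates, so a hit starting at
    position 1 has a start offset of 0.
    """
    offset_start = hmm_start - 1        # unaligned positions before the hit
    offset_end = model_len - hmm_end    # unaligned positions after the hit
    return offset_start <= max_offset and offset_end <= max_offset


# A full-length hit passes; a hit missing 9 positions at the start fails.
print(within_offset_both_ends(1, 100, 100, max_offset=5))
print(within_offset_both_ends(10, 100, 100, max_offset=5))
```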

Documentation

  • Updated tirmite-search.md with accurate offset rules, new Step 0 section, filter summary report section (with example output), updated mermaid workflow diagram, and asymmetrical pairing examples in Next Steps
  • Updated README.md algorithm overview with a new ensemble search sub-section describing the full pipeline and summary report

Tests

  • Updated 3 existing tests to reflect corrected same-strand behavior
  • Added tests covering validate_split_paired_output, write_split_hits, filter_hits_to_pairing_map_models, SearchFilterSummary population in all three filter functions, and log_filter_summary output

Copilot AI and others added 2 commits April 20, 2026 05:21
…and symmetric pairing

Agent-Logs-Url: https://github.com/Adamtaranto/TIRmite/sessions/15ab8524-c5cf-4a38-9dfd-66aba127ea78

Co-authored-by: Adamtaranto <2160099+Adamtaranto@users.noreply.github.com>
Copilot AI changed the title [WIP] Add support for asymmetrical model pairs in tirmite search module Add split paired output and fix max-offset filter for asymmetrical model pairs in tirmite search Apr 20, 2026
Copilot AI requested a review from Adamtaranto April 20, 2026 05:23
@Adamtaranto
Owner

@copilot If --pairing-map is provided only process and report hits belonging to models in the pairing map.

Contributor Author

Copilot AI commented Apr 20, 2026

@copilot If --pairing-map is provided only process and report hits belonging to models in the pairing map.

Done in commit 7e2a7d0. Added filter_hits_to_pairing_map_models() which restricts the hit table to only models listed in the pairing map (as left or right). It's applied as Step 0 in the pairing map processing block inside _process_hits, before nested hit removal and cross-model overlap filtering. Hits from models outside the pairing map are logged at INFO level and excluded from all output files.

Copilot AI changed the title Add split paired output and fix max-offset filter for asymmetrical model pairs in tirmite search Add split paired output, pairing map model filtering, and fix max-offset filter for asymmetrical model pairs in tirmite search Apr 20, 2026
@Adamtaranto
Owner

@copilot Write a summary report covering hits excluded due to models not being in the pairing map, and hits excluded at each filtering step. Report the total nested hits removed for each model. For each model, report the names of all other models that had nested hits removed. For each model pair, report the number of cross-matching hits that were removed.

Update mkdocs tutorial, update mermaid diagram workflow, update readme.

… update docs

Agent-Logs-Url: https://github.com/Adamtaranto/TIRmite/sessions/8ce287d1-1a8f-4fb0-9940-61b366d704e0

Co-authored-by: Adamtaranto <2160099+Adamtaranto@users.noreply.github.com>
Contributor Author

Copilot AI commented Apr 20, 2026

@copilot write a summary report including hits excluded due to models not in pairing map, and excluded at each filtering step. report total nested hits removed for each model. for each model report na...

Done in commit e0621eb. Added a structured filter summary report that logs at the end of all pairing map processing:

  • Step 0 — per-model hit counts excluded because the model is not in the pairing map
  • Step 1 — total nested hits removed per model, with the container model(s) and per-container counts listed
  • Step 2 — cross-model overlap hits removed per model pair (removed → winner) with counts

Example output:

============================================================
Pairing Map Filter Summary
============================================================
Step 0 — Excluded 12 hit(s) from models not in the pairing map:
  UnrelatedModel: 12 hit(s) excluded
Step 1 — Removed 4 nested hit(s) within direct left/right pairs:
  RightA: 4 hit(s) nested within [LeftA (4)]
Step 2 — Removed 7 cross-model hit(s) at shared loci:
  FAMILY2_LEFT → FAMILY1_LEFT: 7 hit(s) removed
============================================================

Also updated tirmite-search.md (new Step 0 section, summary report section with example output and table, updated mermaid diagram) and README.md (expanded algorithm overview with the ensemble search pipeline and summary report).

Copilot AI changed the title Add split paired output, pairing map model filtering, and fix max-offset filter for asymmetrical model pairs in tirmite search Add split paired output, pairing map model filtering, filter summary report, and fix max-offset filter for asymmetrical model pairs in tirmite search Apr 20, 2026
@Adamtaranto Adamtaranto marked this pull request as ready for review April 20, 2026 07:04
@Adamtaranto Adamtaranto merged commit 05e823a into main Apr 20, 2026
6 checks passed
@Adamtaranto Adamtaranto deleted the copilot/add-tirmite-search-asymmetrical-handling branch April 20, 2026 07:05

Development

Successfully merging this pull request may close these issues.

Feature: tirmite search module should handle asymmetrical model pairs

2 participants