Add split paired output, pairing map model filtering, filter summary report, and fix max-offset filter for asymmetrical model pairs in tirmite search #69
…and symmetric pairing Agent-Logs-Url: https://github.com/Adamtaranto/TIRmite/sessions/15ab8524-c5cf-4a38-9dfd-66aba127ea78 Co-authored-by: Adamtaranto <2160099+Adamtaranto@users.noreply.github.com>
@copilot If |
Agent-Logs-Url: https://github.com/Adamtaranto/TIRmite/sessions/1aa80e72-ddea-4d84-b699-db20f0241bd6 Co-authored-by: Adamtaranto <2160099+Adamtaranto@users.noreply.github.com>
Done in commit |
@copilot Write a summary report including hits excluded due to models not in the pairing map, and hits excluded at each filtering step. Report total nested hits removed for each model. For each model, report the names of all other models that had nested hits removed. For each model pair, report the number of cross-matching hits that were removed. Update the mkdocs tutorial, the mermaid workflow diagram, and the readme.
… update docs Agent-Logs-Url: https://github.com/Adamtaranto/TIRmite/sessions/8ce287d1-1a8f-4fb0-9940-61b366d704e0 Co-authored-by: Adamtaranto <2160099+Adamtaranto@users.noreply.github.com>
Done in commit
Example output: Also updated |
`tirmite search` lacked support for writing left/right model hits to separate files (needed by `tirmite pair` for asymmetrical pairs), did not restrict output to models relevant to the pairing map, provided no consolidated filtering report, and applied incorrect `--max-offset` filter logic for same-strand symmetric orientations (F,F / R,R).

Split paired output
- `--split-paired-output` flag writes `<prefix>_left_hits.tab` and `<prefix>_right_hits.tab` based on the pairing map columns
- Requires `--pairing-map`; warns about hits from models absent from the map

Pairing map model filtering
When `--pairing-map` is provided, hits from models not listed in the pairing map are now excluded from all output. This is applied as Step 0 in pairing map processing, before nested hit removal and cross-model overlap filtering. Removed models are reported in the log.

Filter summary report
At the end of all pairing map filtering steps, a structured summary report is emitted covering:
- `(removed → winner)` with counts

Implemented via a new `SearchFilterSummary` dataclass (accumulated across all three filter functions) and a `log_filter_summary()` function called at the end of `_process_hits`.

Fix `--max-offset` for same-strand orientations

`filter_hits_by_anchor()` in `ensemble_search.py` unconditionally kept all hits when orientation was F,F or R,R without a pairing map. It now checks both ends of the query model, matching `hmm_pair.py` logic.

Documentation
- `tirmite-search.md` with accurate offset rules, new Step 0 section, filter summary report section (with example output), updated mermaid workflow diagram, and asymmetrical pairing examples in Next Steps
- `README.md` algorithm overview with a new ensemble search sub-section describing the full pipeline and summary report

Tests
- `validate_split_paired_output`, `write_split_hits`, `filter_hits_to_pairing_map_models`, `SearchFilterSummary` population in all three filter functions, and `log_filter_summary` output
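
For reviewers who want a quick mental model of the Step 0 filter and the left/right split described above, here is a minimal Python sketch. It is not the code in this PR: the hit dictionaries, the two-column pairing map represented as a list of tuples, and the names `FilterSummarySketch`, `keep_mapped_hits`, and `split_by_column` are all hypothetical and exist only for illustration.

```python
from collections import Counter
from dataclasses import dataclass, field


@dataclass
class FilterSummarySketch:
    """Hypothetical stand-in for a filter summary (not TIRmite's SearchFilterSummary)."""

    # model name -> number of hits dropped because the model is not in the map
    not_in_map: Counter = field(default_factory=Counter)


def keep_mapped_hits(hits, pairs, summary):
    """Step 0 sketch: drop hits whose model is absent from the pairing map."""
    mapped = {model for pair in pairs for model in pair}
    kept = []
    for hit in hits:
        if hit["model"] in mapped:
            kept.append(hit)
        else:
            summary.not_in_map[hit["model"]] += 1
    return kept


def split_by_column(hits, pairs):
    """Split hits into left/right lists by the pairing map column of their model."""
    left_models = {left for left, _ in pairs}
    right_models = {right for _, right in pairs}
    left = [h for h in hits if h["model"] in left_models]
    right = [h for h in hits if h["model"] in right_models]
    return left, right


# Assumed inputs: one asymmetrical pair and three hits, one from an unmapped model.
pairs = [("LTR_left", "LTR_right")]
hits = [
    {"model": "LTR_left", "start": 100, "end": 180},
    {"model": "LTR_right", "start": 900, "end": 980},
    {"model": "Other", "start": 400, "end": 450},
]

summary = FilterSummarySketch()
kept = keep_mapped_hits(hits, pairs, summary)
left_hits, right_hits = split_by_column(kept, pairs)
```

In this sketch a model listed in both columns of the map (a symmetric pair) would appear in both the left and right lists. Whether `tirmite search` behaves the same way is not stated in this description.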