Fix incorrect flank reporting: only external flanks, per-pair output, split paired TIR files, summary reports, pairing-map model filtering, rename --report to --gff-report, rename --gff-out to --gff, add startup logging, sequence counts in output filen...#65
… sub-directories, and summary reports

- Add --flanks flag to enable writing flanks for all hits
- Add --flanks-paired flag (replaces --flank-paired-only) for paired-only flanks
- Change --flank-len default to 50
- Handle symmetric FF/RR orientations: write both flanks with warning
- Write per-pair target site FASTA files (one per model pair)
- Write target sites as single-line non-wrapped FASTA
- Create sub-directories per pair when using --pairing-map
- Write summary reports for each model pair
- Update tests for new API
- Update documentation

Agent-Logs-Url: https://github.com/Adamtaranto/TIRmite/sessions/470e9689-0d39-4230-8ada-2b3526f50273
Co-authored-by: Adamtaranto <2160099+Adamtaranto@users.noreply.github.com>
@copilot if input blast or nhmmer results contain hits to multiple models, but the |
When --pairing-map is provided, only hits for models referenced in the pairing map are kept before any downstream processing (coverage calc, e-value filter, writeTIRs, pairing, etc.). Models present in hits but absent from the pairing map are reported with a warning and their hit count, then excluded. If no pairing map is provided, all hits proceed as before. Agent-Logs-Url: https://github.com/Adamtaranto/TIRmite/sessions/ac004274-bb89-48ec-a78e-e255ec4fa6f4 Co-authored-by: Adamtaranto <2160099+Adamtaranto@users.noreply.github.com>
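The filtering step described above can be sketched roughly as follows. This is a minimal illustration, not the TIRmite implementation: the function name `filter_hits_by_pairing_map`, the dict-based hit records, and the list-of-tuples pairing map are all assumptions made for the example.

```python
import logging
from collections import Counter

logger = logging.getLogger(__name__)


def filter_hits_by_pairing_map(hits, pairing_map=None):
    """Keep only hits whose model is referenced in the pairing map.

    hits: list of dicts with a 'model' key (hypothetical record layout).
    pairing_map: list of (left_model, right_model) tuples, or None.
    Returns (kept_hits, excluded_counts_by_model).
    """
    # No pairing map: all hits proceed unchanged.
    if not pairing_map:
        return list(hits), {}

    wanted = {model for pair in pairing_map for model in pair}
    kept = [h for h in hits if h["model"] in wanted]
    excluded = Counter(h["model"] for h in hits if h["model"] not in wanted)

    # Report each excluded model together with its hit count.
    for model, n in sorted(excluded.items()):
        logger.warning("Model %s not in pairing map; excluding %d hit(s)", model, n)

    return kept, dict(excluded)
```

Running this before coverage and e-value filtering means the later steps only ever see hits for models named in the map.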
Done in commit |
For standard One interleaved flanks output incorrectly identifies flanks external to the termini. The second interleaved termini output correctly prints

For left and right models in F,F orientation on the positive strand, we want the left flank of the left model and the right flank of the right model.

Output sub-directories per model pair are created, but duplicates of all output files are also written to the base output dir. If a model pair output dir is created, then only write relevant output there.

Include a detailed summary of the criteria on which hits were filtered in the summary.txt file.
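To make the requested flank geometry concrete, here is a small sketch for the F,F positive-strand case. It assumes 1-based inclusive `(start, end)` coordinates and a hypothetical helper name, `external_flank_coords`; the real writeFlanks function may represent hits differently.

```python
def external_flank_coords(left_hit, right_hit, flank_len=50, seq_len=None):
    """Return the two flanks that lie outside a left/right terminus pair.

    left_hit, right_hit: (start, end) tuples, 1-based inclusive (assumed).
    seq_len: optional sequence length used to clip the right flank.
    A flank is None when the terminus sits at the sequence edge.
    """
    left_start, _ = left_hit
    _, right_end = right_hit

    # Left flank: the bases immediately upstream of the left terminus.
    left_flank = (max(1, left_start - flank_len), left_start - 1)

    # Right flank: the bases immediately downstream of the right terminus.
    right_flank_end = right_end + flank_len
    if seq_len is not None:
        right_flank_end = min(seq_len, right_flank_end)
    right_flank = (right_end + 1, right_flank_end)

    # Drop empty intervals (terminus touching the sequence boundary).
    if left_flank[1] < left_flank[0]:
        left_flank = None
    if right_flank[1] < right_flank[0]:
        right_flank = None
    return left_flank, right_flank
```

The internal flanks (right of the left terminus, left of the right terminus) are never computed, which is the behaviour the comment asks for.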
…nk orientation, enhance summary filter details

- writeTIRs now writes to per-pair subdir in pairing_map mode; all post-loop output writes (writePairedTIRs, writeElements, writeFlanks, writeTargetSites, GFF) are guarded with 'if not pairing_map:' to prevent duplicates in base outDir
- writeFlanks: symmetric same-strand (F,F/R,R) unpaired hits are now skipped (not written) rather than writing both left and right flanks, which would include internal flanks; warning updated to advise --flanks-paired
- _write_pair_summary gains an optional filter_stats dict; summary.txt now includes a 'Filtering criteria applied' section with thresholds and excluded hit counts for the pairing-map model filter, coverage filter, e-value filter, and anchor offset filter; filter_stats is collected incrementally in main() and passed to all summary calls

Agent-Logs-Url: https://github.com/Adamtaranto/TIRmite/sessions/75009e5a-f10a-493c-aa7b-f5f00ef0099f
Co-authored-by: Adamtaranto <2160099+Adamtaranto@users.noreply.github.com>
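As an illustration of how a filter_stats dict could be rendered into the 'Filtering criteria applied' section, consider the sketch below. The dict layout (`threshold` and `excluded` keys), the function name `format_filter_section`, and the line format are assumptions for this example; the actual summary.txt layout is not shown in this conversation.

```python
def format_filter_section(filter_stats):
    """Render filter statistics as a plain-text summary section.

    filter_stats: dict mapping a filter name to a dict with optional
    'threshold' and 'excluded' entries (hypothetical layout).
    """
    lines = ["Filtering criteria applied:"]
    for name, entry in filter_stats.items():
        threshold = entry.get("threshold")
        shown = "n/a" if threshold is None else threshold
        excluded = entry.get("excluded", 0)
        lines.append(f"  {name}: threshold={shown}, excluded={excluded}")
    return "\n".join(lines) + "\n"
```

Collecting the stats incrementally in one dict and formatting them in one place keeps every per-pair summary consistent.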
Addressed in commit
…t duplicate output files

In writeTargetSites, the pair_key for grouping output records now uses the canonical model order from PairingConfig (config.left_model, config.right_model) instead of the genomic-position-based order from flipTIRs. This ensures only ONE target_sites file and ONE interleaved_flanks file per canonical model pair.

In writeFlanks, paired hit flanks are now grouped by config-canonical model names (config.left_model for left flanks, config.right_model for right flanks) for asymmetric pairings, preventing spurious extra flank files when the genomic position order differs from the config's model assignment.

Agent-Logs-Url: https://github.com/Adamtaranto/TIRmite/sessions/b8293eeb-cbc5-4f59-a4f6-6cbcb7a6dcf8
Co-authored-by: Adamtaranto <2160099+Adamtaranto@users.noreply.github.com>
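The idea of keying output by the configured model order, rather than by the order in which the models happen to occur on the genome, can be sketched like this. The helper `canonical_pair_key` and the plain list of `(left, right)` tuples are hypothetical stand-ins for PairingConfig.

```python
def canonical_pair_key(model_a, model_b, configured_pairs):
    """Return the configured (left, right) order for two observed models.

    configured_pairs: list of (left_model, right_model) tuples standing in
    for the pairing configuration (an assumption for this sketch).
    """
    for left, right in configured_pairs:
        # Compare as sets so the observed order does not matter.
        if {model_a, model_b} == {left, right}:
            return (left, right)
    raise KeyError(f"No configured pairing for {model_a!r} and {model_b!r}")


# Both observed orders map to a single key, hence a single output file.
pairs = [("LTIR", "RTIR")]
keys = {
    canonical_pair_key("LTIR", "RTIR", pairs),
    canonical_pair_key("RTIR", "LTIR", pairs),
}
```

Here `keys` contains exactly one entry, so records grouped by it end up in one file per model pair.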
@copilot paired term hits are currently written as a single file with hits to both left and right models. Paired hits should only be written to two separate files, one for left hits and one for right. Ensure mkdocs tutorial for
…mite-pair tutorial Agent-Logs-Url: https://github.com/Adamtaranto/TIRmite/sessions/02571512-36fb-4831-914a-f2b151a8ec28 Co-authored-by: Adamtaranto <2160099+Adamtaranto@users.noreply.github.com>
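The split requested above (one file for left hits, one for right) might look like the following. The function name `write_split_paired_hits`, the `_left.fasta` / `_right.fasta` suffixes, and the `(name, sequence)` record layout are illustrative assumptions, not the names used in the repository.

```python
from pathlib import Path


def write_split_paired_hits(pairs, out_dir, prefix="paired"):
    """Write left and right members of each pair to two separate FASTA files.

    pairs: iterable of ((left_name, left_seq), (right_name, right_seq)).
    Returns the (left_path, right_path) that were written.
    """
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    left_path = out_dir / f"{prefix}_left.fasta"
    right_path = out_dir / f"{prefix}_right.fasta"

    with open(left_path, "w") as left_fh, open(right_path, "w") as right_fh:
        for (left_name, left_seq), (right_name, right_seq) in pairs:
            # Single-line, non-wrapped FASTA records.
            left_fh.write(f">{left_name}\n{left_seq}\n")
            right_fh.write(f">{right_name}\n{right_seq}\n")
    return left_path, right_path
```

Because both files are written in the same loop, record i in the left file always corresponds to record i in the right file.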
|
@copilot the |
Agent-Logs-Url: https://github.com/Adamtaranto/TIRmite/sessions/4e774a09-3166-4989-8a05-1c37ca10b006 Co-authored-by: Adamtaranto <2160099+Adamtaranto@users.noreply.github.com>
Done in |
|
@copilot change
Add logging summary of all input args, noting anything changed from defaults, when
Log check for any required software and report path if found i.e
…args; check blastdbcmd Agent-Logs-Url: https://github.com/Adamtaranto/TIRmite/sessions/d105c84e-c21e-4ee4-9042-af6fa1f0857d Co-authored-by: Adamtaranto <2160099+Adamtaranto@users.noreply.github.com>
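A rough sketch of startup logging along these lines, using only the standard library: `argparse`'s `get_default` to spot values that differ from defaults, and `shutil.which` to locate external tools. The helper names and the two-option demo parser are assumptions for illustration; the real CLI has more options.

```python
import argparse
import logging
import shutil

logger = logging.getLogger("startup")


def changed_args(args, parser):
    """Map each argument that differs from its default to (value, default)."""
    changed = {}
    for name, value in vars(args).items():
        default = parser.get_default(name)
        if value != default:
            changed[name] = (value, default)
    return changed


def log_startup(args, parser, tools=("blastdbcmd",)):
    """Log every argument, flag non-default values, and report tool paths."""
    changed = changed_args(args, parser)
    for name, value in sorted(vars(args).items()):
        note = ""
        if name in changed:
            note = f" (changed from default {changed[name][1]!r})"
        logger.info("%s = %r%s", name, value, note)

    found = {}
    for tool in tools:
        found[tool] = shutil.which(tool)  # None when not on PATH
        if found[tool]:
            logger.info("Found %s at %s", tool, found[tool])
        else:
            logger.warning("%s not found on PATH", tool)
    return found


def build_demo_parser():
    """Tiny stand-in parser with two options mentioned in this PR."""
    parser = argparse.ArgumentParser(prog="demo")
    parser.add_argument("--flank-len", type=int, default=50)
    parser.add_argument("--gff", action="store_true")
    return parser
```

For example, parsing `["--flank-len", "10"]` with the demo parser would log `flank_len` as changed from its default of 50 and leave `gff` unmarked.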
Done in
|
Agent-Logs-Url: https://github.com/Adamtaranto/TIRmite/sessions/6fff4ee8-7897-479e-be05-625f8010d5aa Co-authored-by: Adamtaranto <2160099+Adamtaranto@users.noreply.github.com>
|
@copilot add sequence count to the end of file names for "interleaved flanks" and for "target sites" |
…ames Agent-Logs-Url: https://github.com/Adamtaranto/TIRmite/sessions/962ab510-3367-4e5d-a487-94168ddef03e Co-authored-by: Adamtaranto <2160099+Adamtaranto@users.noreply.github.com>
Done in |
- `pair_outDir` when `--gff` is set: the GFF block was previously inside `if not pairing_map:` so it was never executed in pairing-map mode
- `interleaved_flanks` and `target_sites` filenames (e.g. `{prefix}{pair_label}_target_sites_{N}.fasta`, `{prefix}{pair_label}_interleaved_flanks_{N}.fasta`)
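The filename pattern quoted above can be reproduced with a one-line helper. This is only a sketch of the naming scheme; `counted_fasta_name` and the `(name, sequence)` record list are hypothetical, and the example prefix and pair label are made up.

```python
def counted_fasta_name(prefix, pair_label, kind, records):
    """Build '{prefix}{pair_label}_{kind}_{N}.fasta', N = number of records.

    kind is expected to be 'target_sites' or 'interleaved_flanks'.
    """
    n_seqs = len(records)
    return f"{prefix}{pair_label}_{kind}_{n_seqs}.fasta"


# Example with two illustrative records.
demo_records = [("site1", "ACGT"), ("site2", "TTGA")]
demo_name = counted_fasta_name("run1_", "LTIR_RTIR", "target_sites", demo_records)
```

Since the count is only known once all records are collected, the name has to be computed after grouping and before the file is opened.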