Open
Description of the bug
Currently, Sarek produces BAM files for variant calling following the GATK best practices, which includes base quality score recalibration (enabled by default).
For DeepVariant, this is discouraged, as stated by the developers:
https://github.com/google/deepvariant/blob/r1.4/docs/trio-merge-case-study.md
The recommendation is to use BAM files with their original quality scores. If the BAM files have already been recalibrated, two optional DeepVariant flags make it use the original scores instead: `--parse_sam_aux_fields` and `--use_original_quality_scores`.
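A minimal sketch of how these flags could be toggled, assuming the caller already knows whether recalibration was performed. The helper name `deepvariant_extra_flags` is hypothetical, and the flag names are copied from the text above; how they are actually forwarded to DeepVariant depends on the entry point in use, which is not covered here.

```python
from typing import List

# Flag names taken verbatim from the issue text. How they are passed on to
# DeepVariant depends on the entry point used (not specified here).
ORIGINAL_QUALITY_FLAGS = [
    "--parse_sam_aux_fields",
    "--use_original_quality_scores",
]


def deepvariant_extra_flags(recalibrated: bool) -> List[str]:
    """Return the extra flags to add for a given input BAM.

    Recalibrated input gets both flags so the original quality scores are
    used; non-recalibrated input needs no extra flags.
    """
    return list(ORIGINAL_QUALITY_FLAGS) if recalibrated else []
```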
Recalibration can be disabled globally in Sarek, but that could negatively affect other subworkflows such as GATK, where recalibration is still considered useful (not sure it really is...).
Solution:
- Easy: Add an optional parameter to the DeepVariant module that sets the above flags if recalibration was performed.
- Slightly less easy: Emit both deduped and deduped+recalibrated BAM/CRAM files after the GATK best-practices alignment/deduping and pass them to the appropriate subworkflows, i.e. GATK gets the recalibrated BAM and all other subworkflows get the non-recalibrated BAM. This might be the better choice in the long term, depending on how recalibration affects other callers/tools (largely unexplored, I think).
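The second option amounts to a per-caller routing decision. The sketch below illustrates that logic in Python rather than in Sarek's own workflow code; the names `AlignedSample`, `RECAL_CALLERS` and `bam_for_caller` are hypothetical, as is the choice of which callers count as GATK-based.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AlignedSample:
    """Both alignment outputs emitted for one sample (paths are illustrative)."""
    deduped: str        # duplicates marked, original quality scores
    recalibrated: str   # duplicates marked and recalibrated


# Assumption: only these callers should receive the recalibrated file.
RECAL_CALLERS = frozenset({"haplotypecaller", "mutect2"})


def bam_for_caller(sample: AlignedSample, caller: str) -> str:
    """Pick the input file for a caller: recalibrated for GATK, deduped otherwise."""
    if caller.lower() in RECAL_CALLERS:
        return sample.recalibrated
    return sample.deduped
```

With this layout DeepVariant would receive the deduped file and would need no extra flags, while the GATK callers keep their recalibrated input.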
Command used and terminal output
No response
Relevant files
No response
System information
No response
Metadata
Labels: bug (Something isn't working)
Status: To do