DeepVariant requires additional CLI options for recalibrated BAM files #682

@marchoeppner

Description of the bug

Currently, Sarek produces BAM files for variant calling following the GATK best practices, which include recalibration (enabled by default).

For DeepVariant, this is discouraged, as stated by the developers:

https://github.com/google/deepvariant/blob/r1.4/docs/trio-merge-case-study.md

It is recommended to use BAM files with original quality scores. In the case that BAM files went through recalibration, optional DV flags can be used in order to use original scores: --parse_sam_aux_fields, --use_original_quality_scores.

While recalibration can be disabled globally within Sarek, doing so could negatively affect other subworkflows such as GATK, where recalibration is still considered useful (not sure it really is...).

Solution:

  1. Easy: Add an optional parameter to the DeepVariant module that sets the above flags if recalibration was performed.
  2. Slightly less easy: Emit both deduplicated and deduplicated+recalibrated BAM/CRAM files after the GATK best-practices alignment/deduplication and pass them to the appropriate subworkflows, i.e. GATK gets the recalibrated BAM file and all other subworkflows get the non-recalibrated BAM. This might be the better choice in the long term, depending on what effect recalibration has on other callers/tools (largely unexplored, I think).
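
To make the two options concrete, here is a minimal Python sketch of the decision logic only. Sarek itself is written in Nextflow, so this is an illustration and not a proposed implementation. The two flag names are taken from the DeepVariant documentation quoted above; how they are actually forwarded to the DeepVariant command line is not shown. The function names and the set of callers that receive recalibrated input are hypothetical.

```python
from typing import List

# Flag names as quoted from the DeepVariant documentation in this issue.
ORIGINAL_QUALITY_FLAGS = ["--parse_sam_aux_fields", "--use_original_quality_scores"]


def deepvariant_extra_args(recalibrated: bool) -> List[str]:
    """Option 1: add the original-quality flags only for recalibrated input."""
    # Return a copy so callers cannot mutate the module-level constant.
    return list(ORIGINAL_QUALITY_FLAGS) if recalibrated else []


def select_input(caller: str, dedup_bam: str, recal_bam: str) -> str:
    """Option 2: route the recalibrated file to GATK callers only."""
    # Hypothetical list of callers that should get recalibrated input;
    # every other caller receives the deduplicated, non-recalibrated file.
    recal_callers = {"haplotypecaller", "mutect2"}
    return recal_bam if caller.lower() in recal_callers else dedup_bam
```

With option 1, `deepvariant_extra_args(True)` yields both flags and `deepvariant_extra_args(False)` yields an empty list. With option 2, `select_input("deepvariant", ...)` returns the non-recalibrated file, so no extra flags would be needed.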

Command used and terminal output

No response

Relevant files

No response

System information

No response

Metadata

Labels

bug (Something isn't working)
