Add config source attribution (blame) to nextflow config command #6894

@ewels

Summary

Add a new flag to the nextflow config command that shows the source/origin of each configuration value, similar to git blame. This would help users debug config resolution issues by showing where each value was loaded from.

Motivation

Debugging configuration resolution can be difficult, especially when configs come from multiple sources (pipeline config, user config, CLI params, Platform/Seqera configs, etc.). Currently there's no easy way to determine which config file a particular value came from or which source "won" when multiple configs define the same value.

Proposed Solution

Add a --blame (or similar) flag to nextflow config that annotates each config value with its source location.

Example Output Formats

Option 1: Comment-style annotation (works with -flat)

process.memory = { 6.GB * task.attempt }  // /Users/ewels/GitHub/nf-core/rnaseq/conf/test.config
process.time = { 4.h * task.attempt }  // /Users/ewels/GitHub/nf-core/rnaseq/conf/test.config
process.errorStrategy = { task.exitStatus in ((130..145) + 104 + 175) ? 'retry' : 'finish' }  // /Users/ewels/GitHub/nf-core/rnaseq/nextflow.config
process.maxRetries = 1  // /Users/ewels/GitHub/nf-core/rnaseq/nextflow.config
process.'withLabel:process_low'.memory = { 12.GB * task.attempt }  // /data/working/launchdir/nextflow.config
process.'withLabel:process_low'.time = { 4.h * task.attempt }  // [command line]

Option 2: Git-blame style left sidebar

/Users/ewels/GitHub/nf-core/rnaseq/conf/test.config | process.memory = { 6.GB * task.attempt }
/Users/ewels/GitHub/nf-core/rnaseq/conf/test.config | process.time = { 4.h * task.attempt }
/Users/ewels/GitHub/nf-core/rnaseq/nextflow.config  | process.errorStrategy = { task.exitStatus in ((130..145) + 104 + 175) ? 'retry' : 'finish' }
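
To make the two formats concrete, here is a rough Python sketch of how each could be rendered from a list of already-attributed values. This is illustration only, not Nextflow code: the `Entry` tuple and both function names are hypothetical, and values are assumed to be pre-rendered strings.

```python
from typing import List, Tuple

# Hypothetical record: (flattened key, rendered value, source label).
# The source label is a file path or a marker such as "[command line]".
Entry = Tuple[str, str, str]


def render_comment_style(entries: List[Entry]) -> List[str]:
    """Option 1: append the source as a trailing comment."""
    return [f"{key} = {value}  // {source}" for key, value, source in entries]


def render_sidebar_style(entries: List[Entry]) -> List[str]:
    """Option 2: put the source in a left column, padded so the bars align."""
    width = max((len(source) for _, _, source in entries), default=0)
    return [
        f"{source.ljust(width)} | {key} = {value}"
        for key, value, source in entries
    ]


if __name__ == "__main__":
    demo = [
        ("process.maxRetries", "1", "/pipeline/nextflow.config"),
        ("process.time", "{ 4.h * task.attempt }", "[command line]"),
    ]
    print("\n".join(render_comment_style(demo)))
    print("\n".join(render_sidebar_style(demo)))
```

Option 1 keeps the output close to valid config syntax, while Option 2 needs the longest source label up front to align the column.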

Implementation Notes

The ConfigBuilder.parse0() method already tracks parsed config files via parsedConfigFiles. The implementation would need to:

  1. Track the source file/origin for each config value as configs are parsed and merged
  2. Preserve this attribution through the merge process (later sources override earlier ones)
  3. Add a new CLI flag to the config command to enable blame output
  4. Format and display the attribution in the output
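
Steps 1 and 2 can be sketched roughly as follows. This is a Python illustration of the idea only, not how `ConfigBuilder` works: `flatten` and `merge_with_blame` are hypothetical names, configs are modelled as plain nested dicts, and sources are assumed to arrive in ascending priority order.

```python
from typing import Any, Dict, Iterable, Tuple


def flatten(config: Dict[str, Any], prefix: str = "") -> Dict[str, Any]:
    """Flatten nested dicts into dotted keys, e.g. {'process': {'memory': x}}
    becomes {'process.memory': x}. Empty nested dicts produce no keys."""
    flat: Dict[str, Any] = {}
    for key, value in config.items():
        dotted = f"{prefix}.{key}" if prefix else key
        if isinstance(value, dict):
            flat.update(flatten(value, dotted))
        else:
            flat[dotted] = value
    return flat


def merge_with_blame(
    sources: Iterable[Tuple[str, Dict[str, Any]]]
) -> Dict[str, Tuple[Any, str]]:
    """Merge configs given as (source label, config) pairs, lowest priority
    first. Each key maps to (value, label of the source that set it last)."""
    merged: Dict[str, Tuple[Any, str]] = {}
    for label, config in sources:
        for key, value in flatten(config).items():
            # A later source overrides the value and takes over the attribution.
            merged[key] = (value, label)
    return merged


if __name__ == "__main__":
    result = merge_with_blame([
        ("/pipeline/nextflow.config", {"process": {"memory": "6.GB", "maxRetries": 1}}),
        ("[command line]", {"process": {"memory": "12.GB"}}),
    ])
    for key, (value, source) in result.items():
        print(f"{key} = {value}  // {source}")
```

Storing the label next to each leaf value means the attribution survives the merge without a separate lookup table, at the cost of changing the shape of the merged structure.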

Additional Considerations

  • Trace logging: Consider also adding trace-level logging in ConfigBuilder to show configs as they're loaded during actual pipeline runs (useful for debugging Platform-injected configs)
  • Platform integration: When configs come from Seqera Platform, ideally the attribution would distinguish between different Platform config sources (Compute Environment, Workspace, etc.). This may require Platform to pass multiple config files or include blame metadata.
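
The trace-logging idea could look something like the Python sketch below. Everything here is hypothetical: the logger name, the `load_sources` function, and the shallow top-level merge are stand-ins, and Python has no built-in TRACE level, so one is registered below DEBUG for the example.

```python
import logging
from typing import Any, Dict, Iterable, Tuple

# Python's logging module has no TRACE level; 5 sits below DEBUG (10).
TRACE = 5
logging.addLevelName(TRACE, "TRACE")

log = logging.getLogger("config_builder_sketch")


def load_sources(sources: Iterable[Tuple[str, Dict[str, Any]]]) -> Dict[str, Any]:
    """Merge (label, config) pairs in order, logging each source as it loads.
    The merge is deliberately shallow (top-level keys only) to keep the
    example short."""
    merged: Dict[str, Any] = {}
    for label, config in sources:
        log.log(TRACE, "Loading config source %s with keys %s", label, sorted(config))
        merged.update(config)
    return merged


if __name__ == "__main__":
    logging.basicConfig(level=TRACE, format="%(levelname)s %(message)s")
    load_sources([
        ("/pipeline/nextflow.config", {"process": {"maxRetries": 1}}),
        ("[platform]", {"process": {"maxRetries": 3}}),
    ])
```

Logging at load time, rather than only in `nextflow config`, would surface injected configs in the log of the actual run.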

Related Discussion

Slack thread: https://seqera.slack.com/archives/CUTDS6JE9/p1772703970085559?thread_ts=1772617083.053159&cid=CUTDS6JE9
