docker.registry needs to override process.container when a registry is used in the container URI #6150

Open
@stevekm

Description

When a Nextflow process defines a process.container whose URI includes a container registry, Nextflow ignores the docker.registry setting and pulls from the registry named in process.container.

Example:

https://github.com/nf-core/scdownstream/blob/1512d6ff5e3f2484c36b1bd1d53a0f435a7b3ae4/modules/nf-core/cellbender/merge/main.nf#L8

process CELLBENDER_MERGE {
    tag "$meta.id"
    label 'process_single'

    conda "${moduleDir}/environment.yml"
    container "${ workflow.containerEngine == 'singularity' && !task.ext.singularity_pull_docker_container ?
        'oras://community.wave.seqera.io/library/cellbender:0.3.0--c4addb97ab2d83fe':
        'community.wave.seqera.io/library/cellbender:0.3.0--41318a055fc3aacb' }"

Here, process.container is community.wave.seqera.io/library/cellbender:0.3.0--41318a055fc3aacb, where community.wave.seqera.io is the name of the Docker container registry.

However, we set docker.registry = 12345678900.dkr.ecr.us-east-1.aws.com (etc.) to point to our internal AWS ECR registry, expecting that all containers would be pulled from our internal registry and that a missing container would error out (prompting us to review and import the container into our ECR). This is not the case: Nextflow ignores docker.registry and pulls the container from the public Wave registry instead.

This is important to meet Infosec compliance requirements in environments that require all containers to be internally hosted.

Similarly, this also impacted our CI/CD; see seqeralabs/nf-canary#32.

I was aware that nf-core was moving towards Wave containers, and we had prepared for that with our ECR import methodology. However, I was surprised to see that nf-core pipelines now appear to hard-code the container registry in the pipeline. This breaks docker.registry, which we relied on to ensure we get our internal version of the container. @ewels

It seems like the easiest solution would be for Nextflow to let docker.registry actually control which Docker registry is used, rather than ignoring it when process.container includes a registry in the URI. If needed, I can post some code snippets showing ways to programmatically parse the container URI and detect whether it includes a registry (which would make it easier to strip the registry from the container name).
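
To illustrate the kind of parsing meant here, a minimal sketch in Python follows. It is not Nextflow's implementation; the function names split_registry and apply_registry_override are hypothetical. It assumes a simplified form of the usual Docker naming convention: the first path component is treated as a registry host only if it contains a '.' or ':' or is exactly 'localhost'. It also assumes that a transport prefix such as oras:// can simply be dropped before parsing.

```python
from typing import Optional, Tuple


def split_registry(image: str) -> Tuple[Optional[str], str]:
    """Split a container URI into (registry, remainder).

    Hypothetical helper. Returns (None, name) when no registry is detected.
    """
    # Drop a transport scheme such as 'oras://' or 'docker://' if present.
    _scheme, sep, rest = image.partition("://")
    name = rest if sep else image

    first, slash, remainder = name.partition("/")
    # Simplified Docker convention: the first component is a registry host
    # only if it looks like a hostname (has '.' or ':') or is 'localhost'.
    if slash and ("." in first or ":" in first or first == "localhost"):
        return first, remainder
    return None, name


def apply_registry_override(image: str, registry: str) -> str:
    """Re-point an image at `registry`, discarding any registry (and
    transport scheme) embedded in `image`. Hypothetical helper."""
    _, remainder = split_registry(image)
    return f"{registry.rstrip('/')}/{remainder}"


if __name__ == "__main__":
    wave = "community.wave.seqera.io/library/cellbender:0.3.0--41318a055fc3aacb"
    print(split_registry(wave))
    print(apply_registry_override(wave, "12345678900.dkr.ecr.us-east-1.aws.com"))
```

With something along these lines, an image such as ubuntu:22.04 (no registry in the name) and the Wave image above would both resolve to the configured docker.registry, so a container missing from the internal registry would fail to pull instead of silently coming from the public one.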

Not being able to control where your container is pulled from, because docker.registry is ignored, seems like it may be a behavioral oversight. What do you think? Thanks.

Environment

  • Nextflow version: 24.10.4
  • Operating system: Linux AWS
