Description
When you define your Nextflow process with a process.container that includes a container registry in the URI, Nextflow ignores the docker.registry setting and uses the registry listed in the process.container.
Example:
process CELLBENDER_MERGE {
    tag "$meta.id"
    label 'process_single'
    conda "${moduleDir}/environment.yml"
    container "${ workflow.containerEngine == 'singularity' && !task.ext.singularity_pull_docker_container ?
        'oras://community.wave.seqera.io/library/cellbender:0.3.0--c4addb97ab2d83fe':
        'community.wave.seqera.io/library/cellbender:0.3.0--41318a055fc3aacb' }"
    // remainder of the process omitted
}
Here the process.container is community.wave.seqera.io/library/cellbender:0.3.0--41318a055fc3aacb, where community.wave.seqera.io is the name of the Docker container registry.
However, we set docker.registry = 12345678900.dkr.ecr.us-east-1.aws.com etc to point to our internal AWS ECR registry, expecting that all containers would be pulled from our internal registry and that a missing container would error out (prompting us to review and import the container to our ECR). This is not the case. Instead, Nextflow ignores docker.registry and pulls the container from the public Wave registry.
Being able to force all pulls through docker.registry is important for meeting Infosec compliance requirements in environments that require all containers to be internally hosted.
This also impacted our CI/CD; see seqeralabs/nf-canary#32.
I was aware that nf-core was moving towards Wave containers, and we had prepared for that with our ECR import methodologies. However, I was surprised to see that nf-core pipelines now appear to hard-code the container registry in the pipeline. This breaks the use of docker.registry, which we relied on to ensure we get our internal version of the container. @ewels
It seems like the easiest solution would be for Nextflow to let docker.registry actually control the Docker registry used, rather than ignoring it when process.container includes a registry in the URI. If needed, I can post some code snippets showing methods for programmatically parsing the container URI to detect whether it includes a registry (which would make it easier to strip the registry from the container name).
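To give a rough idea of the kind of parsing meant here, below is a minimal Python sketch. It is not Nextflow code and not a proposed patch. The function names split_registry and force_registry are hypothetical, and the heuristic is an assumption borrowed from the usual Docker image-reference convention: the first path component is treated as a registry host only if it contains a "." or ":" or is exactly "localhost". It does not handle scheme prefixes such as oras://.

```python
from typing import Optional, Tuple


def split_registry(image: str) -> Tuple[Optional[str], str]:
    """Split an image reference into (registry, remainder).

    Heuristic (assumed, following the common Docker convention): the first
    path component counts as a registry host only if it contains '.' or ':'
    or is exactly 'localhost'. Otherwise no registry is reported.
    """
    first, sep, rest = image.partition("/")
    if sep and ("." in first or ":" in first or first == "localhost"):
        return first, rest
    return None, image


def force_registry(image: str, registry: str) -> str:
    """Re-point an image at `registry`, dropping any registry it already names.

    This models the behaviour requested in this issue, not what Nextflow
    currently does.
    """
    _, remainder = split_registry(image)
    return f"{registry.rstrip('/')}/{remainder}"


if __name__ == "__main__":
    image = "community.wave.seqera.io/library/cellbender:0.3.0--41318a055fc3aacb"
    print(split_registry(image))
    print(force_registry(image, "12345678900.dkr.ecr.us-east-1.aws.com"))
```

With the container from the example above, split_registry reports community.wave.seqera.io as the registry, and force_registry returns 12345678900.dkr.ecr.us-east-1.aws.com/library/cellbender:0.3.0--41318a055fc3aacb. An image such as ubuntu:22.04 is reported as having no registry, because the ":" there belongs to the tag and not to a leading path component.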
Being unable to control where containers are pulled from because docker.registry is ignored seems like a behavioral oversight. What do you think? Thanks.
Environment
- Nextflow version: 24.10.4
- Operating system: Linux AWS