Default stageOutMode to 'copy' for Google Batch executor #6917
Open
rhassaine wants to merge 1 commit into nextflow-io:master
On Google Batch, task outputs are unstaged from local scratch to a gcsfuse-mounted work directory, which is always a cross-device operation. The default 'move' mode uses 'mv', which fails in two scenarios:

- When output declarations include both a directory and files inside it, the directory is moved first (with all contents), causing subsequent file moves to fail with 'No such file or directory'
- When staged input files are symlinks pointing back to the work directory, 'mv' detects source and target as the same file

Using 'copy' mode avoids both issues at no additional I/O cost, since a cross-device 'mv' already performs a copy internally.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Signed-off-by: rhassaine <r.hassaine@hartwigmedicalfoundation.nl>
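The overlapping-output failure described above can be reproduced outside Nextflow. The following is a minimal sketch, not the script Nextflow generates: the helper names `make_outputs` and `unstage_overlapping` are hypothetical, and two temporary directories stand in for the local scratch and the gcsfuse-mounted work directory. Both directories will usually sit on the same filesystem, which is fine here because this particular failure comes from the order of the moves rather than from crossing devices.

```shell
#!/bin/sh
# Simplified sketch, not Nextflow's generated unstaging script.
scratch=$(mktemp -d)   # stands in for the task's local scratch
work=$(mktemp -d)      # stands in for the gcsfuse-mounted work directory

# (Re)create the task outputs: a directory with one file inside it.
make_outputs() {
    rm -rf "$scratch/outdir"
    mkdir -p "$scratch/outdir"
    echo "result" > "$scratch/outdir/a.txt"
}

# Unstage the directory first, then the file inside it, using the
# command passed as arguments (for example: mv -f, or cp -fRL).
unstage_overlapping() {
    dest=$(mktemp -d "$work/run.XXXXXX")
    "$@" "$scratch/outdir" "$dest/" &&
    "$@" "$scratch/outdir/a.txt" "$dest/outdir/"
}

# 'move' mode: the directory move takes a.txt with it, so the second
# move has no source left to act on.
make_outputs
if unstage_overlapping mv -f 2>/dev/null; then move_status=ok; else move_status=failed; fi

# 'copy' mode: the source tree stays in place, so both steps succeed.
make_outputs
if unstage_overlapping cp -fRL 2>/dev/null; then copy_status=ok; else copy_status=failed; fi

echo "move: $move_status"
echo "copy: $copy_status"

rm -rf "$scratch" "$work"
```

With a standard `mv` and `cp`, this should report `move: failed` and `copy: ok`. The symlinked-input scenario is not covered by this sketch.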
6a4a212 to f03f527
Summary

- Default `stageOutMode` to `copy` for Google Batch tasks when not explicitly set by the user

Problem

On Google Batch, task outputs are unstaged from local scratch (local SSD) to a gcsfuse-mounted work directory, which is always a cross-device operation. The default `move` mode uses `mv`, which fails in two scenarios:

- Overlapping output declarations: When a process declares both a directory and files inside it (e.g. `path("outdir/")` and `path("outdir/*.txt")`), `mv` moves the directory first (with all contents), causing subsequent file moves to fail with `No such file or directory`
- Symlinked inputs: When staged input files are symlinks pointing back to the work directory, `mv` detects source and target as the same file and fails with `'X' and 'Y' are the same file`

Fix

After `super(bean)` in `GoogleBatchScriptLauncher`, default `stageoutMode` to `copy` on the `SimpleFileCopyStrategy` if the user hasn't explicitly set a `stageOutMode`. This uses `cp -fRL` instead of `mv -f` for unstaging, which handles both failure scenarios.

Test plan

- Overlapping output declarations (`path("outdir/")` + `path("outdir/*.txt")`): both tasks succeeded
- A `stageOutMode` set in config still takes precedence (the fix only applies when `bean.stageOutMode` is not set)

🤖 Generated with Claude Code
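The precedence rule from the Fix and Test plan sections can be illustrated with a small shell sketch. It is not the Groovy change in this PR: `effective_stage_out_mode` and `unstage_command` are hypothetical helpers, and only the two modes discussed here are modeled.

```shell
#!/bin/sh
# Hypothetical helper, not Nextflow code: pick the effective stage-out mode.
# $1 is the user's explicit setting, or empty when nothing was configured.
effective_stage_out_mode() {
    if [ -n "$1" ]; then
        printf '%s\n' "$1"      # an explicit user setting wins
    else
        printf '%s\n' "copy"    # Google Batch default proposed by this PR
    fi
}

# Hypothetical helper: map a mode to the unstaging command named in the
# Fix section. Other modes are left out of this sketch.
unstage_command() {
    case "$1" in
        copy) printf '%s\n' "cp -fRL" ;;
        move) printf '%s\n' "mv -f" ;;
        *)    printf 'mode not covered by this sketch: %s\n' "$1" >&2
              return 1 ;;
    esac
}

default_mode=$(effective_stage_out_mode "")
explicit_mode=$(effective_stage_out_mode "move")

echo "unset    -> $default_mode ($(unstage_command "$default_mode"))"
echo "explicit -> $explicit_mode ($(unstage_command "$explicit_mode"))"
```

The point of the design is that the new default only fills a gap: a pipeline that already sets `stageOutMode` keeps its current behavior.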