Solve queueSize exceeded when using job arrays #6047

Open
wants to merge 6 commits into
base: master
@jorgee (Contributor) commented May 7, 2025

close #5920

  • Modify canSubmit to avoid exceeding the executor.queueSize parameter.
  • Print a warning when the array size exceeds the queueSize. This can make the task array 'unsubmittable' (not sure if we should abort the execution in this situation).
  • Add unit tests
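
The first two bullets can be read as an admission check along the following lines. This is only a minimal sketch of the idea, not the code in this PR: `QueueSketch`, `canSubmit(int)` and `isUnsubmittable` are hypothetical names, and treating a queue size of 0 as "no limit" is an assumption made for the example.

```groovy
// Hypothetical sketch: names and structure are illustrative, not Nextflow's actual classes.
class QueueSketch {
    int queueSize      // plays the role of executor.queueSize (0 = no limit, assumed here)
    int running = 0    // queue slots currently occupied

    // A handler that would occupy `slots` entries (1 for a plain task,
    // the array size for a job array) is admitted only if it fits entirely.
    boolean canSubmit(int slots) {
        return queueSize == 0 || running + slots <= queueSize
    }

    // An array larger than the whole queue can never fit, however long it waits.
    boolean isUnsubmittable(int arraySize) {
        return queueSize > 0 && arraySize > queueSize
    }
}

def q = new QueueSketch(queueSize: 15)
assert q.canSubmit(10)             // a first array of 10 fits
q.running += 10
assert !q.canSubmit(10)            // a second array must wait: 10 + 10 > 15
assert !q.isUnsubmittable(10)      // but it will fit once the first one finishes

def small = new QueueSketch(queueSize: 5)
assert small.isUnsubmittable(10)   // 10 > 5: this array can never be submitted
```

Under these assumptions, an array that exceeds the queue size is never admitted, which is why the second bullet asks whether a warning is enough.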

Tested with this pipeline

process test {
    array 10 
    input:
    val x

    """
    echo $x
    """
}

workflow {
   Channel.of(1..20) | test
}

Run with a config using the awsbatch executor and different executor.queueSize values:

  • size 15: Only one job is submitted at a time in AWS Batch.
  • size 5: Prints a warning, but the run continues without submitting the task array jobs.
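
For anyone wanting to reproduce this, a config along these lines should be enough. It is a sketch only: the queue name, region and S3 work directory are placeholders, not the values used in the test above.

```groovy
// nextflow.config (sketch): queue name, region and bucket are hypothetical placeholders
process.executor = 'awsbatch'
process.queue    = 'my-batch-queue'        // placeholder AWS Batch job queue
aws.region       = 'eu-west-1'             // placeholder region
workDir          = 's3://my-bucket/work'   // placeholder S3 work directory

executor.queueSize = 15                    // set to 5 to reproduce the second case
```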

@jorgee jorgee linked an issue May 7, 2025 that may be closed by this pull request

netlify bot commented May 7, 2025

Deploy Preview for nextflow-docs-staging canceled.

Name Link
🔨 Latest commit a944cc8
🔍 Latest deploy log https://app.netlify.com/projects/nextflow-docs-staging/deploys/68387c46a957710008a2059a

@jorgee jorgee requested a review from bentsherman May 8, 2025 13:43
@jorgee jorgee changed the title 5920 job array with queuesize exceeds queuesize Solve queueSize exceeded when using job arrays May 9, 2025
Signed-off-by: Ben Sherman <[email protected]>
@bentsherman
Member

  • change the warning to an error, since the run will hang anyway
  • make process maxForks aware of job arrays

@@ -266,7 +273,7 @@ abstract class TaskHandler {
      */
     boolean canForkProcess() {
         final max = task.processor.maxForks
-        return !max ? true : task.processor.forksCount < max
+        return !max ? true : task.processor.forksCount + getForksCount() <= max
Member

Shouldn't this be task.processor.forksCount * getForksCount(), i.e. multiplied?

Member

why would it be multiplied?

Member

Oh, I was totally confused by the getForksCount naming. This new attribute should be named differently because it overlaps with processor.forksCount, which has a completely different meaning. Something like getRunsCount or getArraySize would be much better.

Member

how are they different in meaning?

Signed-off-by: Ben Sherman <[email protected]>
@pditommaso
Member

Because processor.forksCount is related to the maxForks directive, whereas here the aim is to control the maximum number of task runs.

@bentsherman
Member

So "forks" here just refers to the number of concurrent tasks, so I see no reason to call it something else in the task handler.
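
For readers following the thread, the changed check can be illustrated with a small sketch. `ProcessorSketch` and `HandlerSketch` are hypothetical stand-ins for the real processor and task handler; only `maxForks`, `forksCount`, `getForksCount()` and the return expression come from the diff. That `getForksCount()` returns the array size for a job array and 1 for a plain task is an assumption based on the discussion, not something the diff shows.

```groovy
// Hypothetical stand-ins; only the return expression mirrors the diff.
class ProcessorSketch {
    Integer maxForks   // null or 0 means no limit (Groovy truth)
    int forksCount     // tasks of this process currently running
}

class HandlerSketch {
    ProcessorSketch processor
    int arraySize = 1  // assumed: 1 for a plain task, N for a job array

    // Assumed meaning: how many concurrent tasks this handler adds once submitted.
    int getForksCount() { arraySize }

    boolean canForkProcess() {
        final max = processor.maxForks
        return !max ? true : processor.forksCount + getForksCount() <= max
    }
}

def proc = new ProcessorSketch(maxForks: 12, forksCount: 4)
assert new HandlerSketch(processor: proc).canForkProcess()                    // 4 + 1 <= 12
assert !(new HandlerSketch(processor: proc, arraySize: 10).canForkProcess())  // 4 + 10 > 12
```

Under this assumption the sum is the natural operation: the handler adds its own tasks to those already running. For a plain task, `forksCount + 1 <= max` is the same condition as the previous `forksCount < max`.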

Successfully merging this pull request may close these issues.

Job Array with queueSize exceeds queueSize