SLURM partition not detected from config #801

@gcapes

Description

I tried to run this simple workflow on CSF3, which uses the SLURM scheduler:

template_components:
  task_schemas:
  - objective: locate_matlab
    actions:
    - commands:
      - command: which matlab
      environments:
      - scope:
          type: any
        environment: matlab_env  

tasks:
- schema: locate_matlab

resources:
  any:
    scheduler_args:
      options:
        --time: 00:30:00

I came up against this error:

sbatch: error: Batch job submission failed: Invalid partition name specified
Traceback (most recent call last):
  File "/net/scratch/mbexegc2/matflow-demo-workflows/.venv/bin/matflow", line 8, in <module>
    sys.exit(cli())
  File "/net/scratch/mbexegc2/matflow-demo-workflows/.venv/lib64/python3.9/site-packages/click/core.py", line 1161, in __call__
    return self.main(*args, **kwargs)
  File "/net/scratch/mbexegc2/matflow-demo-workflows/.venv/lib64/python3.9/site-packages/click/core.py", line 1082, in main
    rv = self.invoke(ctx)
  File "/net/scratch/mbexegc2/matflow-demo-workflows/.venv/lib64/python3.9/site-packages/click/core.py", line 1697, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "/net/scratch/mbexegc2/matflow-demo-workflows/.venv/lib64/python3.9/site-packages/click/core.py", line 1443, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/net/scratch/mbexegc2/matflow-demo-workflows/.venv/lib64/python3.9/site-packages/click/core.py", line 788, in invoke
    return __callback(*args, **kwargs)
  File "/net/scratch/mbexegc2/matflow-demo-workflows/.venv/lib64/python3.9/site-packages/hpcflow/sdk/cli.py", line 253, in make_and_submit_workflow
    out = app.make_and_submit_workflow(
  File "/net/scratch/mbexegc2/matflow-demo-workflows/.venv/lib64/python3.9/site-packages/hpcflow/sdk/app.py", line 1644, in <lambda>
    return lambda *args, **kwargs: func(*args, **kwargs)
  File "/net/scratch/mbexegc2/matflow-demo-workflows/.venv/lib64/python3.9/site-packages/hpcflow/sdk/app.py", line 2897, in _make_and_submit_workflow
    submitted_js = wk.submit(
  File "/net/scratch/mbexegc2/matflow-demo-workflows/.venv/lib64/python3.9/site-packages/hpcflow/sdk/core/workflow.py", line 3336, in submit
    raise WorkflowSubmissionFailure(exceptions)
hpcflow.sdk.core.errors.WorkflowSubmissionFailure: 
Some jobscripts in submission index 0 could not be submitted:
Jobscript 0 at path: '/net/scratch/mbexegc2/matflow-demo-workflows/wheresmatlab_2025-04-17_103905/artifacts/submissions/0/jobscripts/js_0.sh'
Submit command: ['sbatch', '--parsable', '/net/scratch/mbexegc2/matflow-demo-workflows/wheresmatlab_2025-04-17_103905/artifacts/submissions/0/jobscripts/js_0.sh'].
Reason: 'Non-empty stderr from submit command.'
Submission stderr:
  sbatch: error: Batch job submission failed: Invalid partition name specified
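
To narrow down where the invalid partition comes from, it may help to check which `#SBATCH` directives actually ended up in the generated `js_0.sh`. The following is a minimal diagnostic sketch, not part of hpcflow or MatFlow; the helper names `sbatch_directives` and `requested_partition` are hypothetical, and it assumes the scheduler options are written into the jobscript as standard `#SBATCH` lines.

```python
import re
from typing import List, Optional


def sbatch_directives(script_text: str) -> List[str]:
    """Return the option part of every `#SBATCH` line in a jobscript."""
    directives = []
    for line in script_text.splitlines():
        match = re.match(r"^\s*#SBATCH\s+(.*\S)\s*$", line)
        if match:
            directives.append(match.group(1))
    return directives


def requested_partition(directives: List[str]) -> Optional[str]:
    """Return the partition named by `--partition`/`-p`, or None if absent."""
    pattern = re.compile(r"(?:^|\s)(?:--partition(?:=|\s+)|-p\s*)(\S+)")
    for directive in directives:
        match = pattern.search(directive)
        if match:
            return match.group(1)
    return None


# Illustrative jobscript text (assumed layout, not copied from a real js_0.sh).
example = """#!/bin/bash
#SBATCH --time=00:30:00
#SBATCH --partition=serial
which matlab
"""

found = sbatch_directives(example)
print(found)
print(requested_partition(found))
```

If `requested_partition` returns `None` for the real jobscript, no partition was written to it at all; if it returns a name, that name can be compared against the partitions the cluster reports.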

I side-stepped this by specifying the partition explicitly in the workflow's scheduler options, as shown below:

template_components:
  task_schemas:
  - objective: locate_matlab
    actions:
    - commands:
      - command: which matlab
      environments:
      - scope:
          type: any
        environment: matlab_env  

tasks:
- schema: locate_matlab

resources:
  any:
    scheduler_args:
      options:
        --time: 00:30:00
        --partition: serial
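
To check which partition names the cluster accepts, the output of `sinfo -h -o "%P"` can be parsed; SLURM marks the default partition with a trailing `*`. This is a rough sketch for diagnosis only. The function names are hypothetical, and the partition names in the example (other than `serial`) are made up for illustration.

```python
import subprocess
from typing import List, Optional, Tuple


def parse_partitions(sinfo_output: str) -> Tuple[List[str], Optional[str]]:
    """Parse `sinfo -h -o "%P"` output into (partition names, default partition)."""
    names: List[str] = []
    default = None
    for line in sinfo_output.splitlines():
        name = line.strip()
        if not name:
            continue
        if name.endswith("*"):  # SLURM flags the default partition with '*'
            name = name[:-1]
            default = name
        if name not in names:  # sinfo may repeat a partition once per node state
            names.append(name)
    return names, default


def list_partitions() -> Tuple[List[str], Optional[str]]:
    """Run sinfo on the login node and parse its output (requires SLURM)."""
    result = subprocess.run(
        ["sinfo", "-h", "-o", "%P"],
        capture_output=True,
        text=True,
        check=True,
    )
    return parse_partitions(result.stdout)


# Example with made-up sinfo output; only `serial` appears in this issue.
names, default = parse_partitions("serial\nmulticore*\ngpu\n")
print(names, default)
```

A default of `None` would mean the cluster defines no default partition, so every job has to name one explicitly.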

Labels

SLURM (Related to the SLURM scheduler integration), bug (Something isn't working)

Status

🔲 Todo
