Why does my batch job fail?

Below are common error messages you may see when a batch job fails, with advice on how to fix them.

Invalid account or account/partition combination specified

The complete error message is as shown below:

sbatch: error: Batch job submission failed: Invalid account or account/partition combination specified

This error message refers to the Slurm options --account=<project> and --partition=<partition>. The most common causes are:

  • Project does not exist.
  • Project exists, but you are not a member of it. See how to add a member to a project.
  • You are a project member, but the project has not been enabled on Puhti. See how to add service access for a project.
  • Partition does not exist.
  • Partition exists, but your project is not enabled in it.
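The checks above can be sketched with standard Slurm client commands. This is only a minimal sketch: the project name project_2001234, the partition name small and the script name job.sh are hypothetical placeholders, and the association listing from sacctmgr may not reflect every way a site restricts access to partitions.

```shell
#!/bin/bash
# Hypothetical values: replace with your own project and partition.
project="project_2001234"
partition="small"

# List the partitions known to Slurm, if the Slurm client tools are available.
if command -v sinfo >/dev/null 2>&1; then
    sinfo --summarize
fi

# List the account/partition associations recorded for the current user.
if command -v sacctmgr >/dev/null 2>&1; then
    sacctmgr show associations user="$USER" format=Account,Partition
fi

# Options to pass to sbatch once both values have been confirmed.
sbatch_opts="--account=${project} --partition=${partition}"
echo "sbatch ${sbatch_opts} job.sh"
```

If the project or partition you intend to use does not appear in the listings, the submission is likely to fail with the error above.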

Job violates accounting/QOS policy

The complete error message is as shown below:

sbatch: error: AssocMaxSubmitJobLimit
sbatch: error: Batch job submission failed: Job violates accounting/QOS policy (job submit limit, user's size and/or time limits)

The most common causes are:

  • Job script is missing the --account parameter.
  • Your project has too many jobs in the system, either running or queuing. Note that internally, Slurm counts each job within an array job as a separate job.
  • The job script was executed directly with ./ or bash instead of being submitted with sbatch.
  • Your project has run out of billing units. See How to apply for more billing units.
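To illustrate how array jobs count towards the job limit, the sketch below computes how many jobs a simple array range adds to the system. The array specification 0-99 is a hypothetical example, and the sketch handles only the plain first-last form, not comma-separated lists or step values.

```shell
#!/bin/bash
# Hypothetical array specification, as it would appear in: #SBATCH --array=0-99
array_spec="0-99"

first="${array_spec%-*}"   # text before the dash
last="${array_spec#*-}"    # text after the dash

# Each array index is counted as a separate job.
array_jobs=$(( last - first + 1 ))

echo "--array=${array_spec} counts as ${array_jobs} jobs"
```

A single array submission like this one therefore counts as 100 jobs against the project's job limit.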

Requested node configuration is not available

The complete error message is as shown below:

sbatch: error: Batch job submission failed: Requested node configuration is not available

The most common causes are:

  • Requesting a resource, such as a GPU or NVMe storage, in a partition that does not have it.
  • Requesting more memory or time than the chosen partition offers. In particular, if you specify memory with the --mem-per-cpu flag, note that the value is multiplied by the number of requested CPUs (1 per task by default), and the result must be within the limits of the chosen partition.
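The memory multiplication can be checked by hand before submitting. The sketch below assumes a single-node job. All values, including the partition limit, are hypothetical placeholders and not actual Puhti limits.

```shell
#!/bin/bash
# Hypothetical request, matching: --ntasks=4 --cpus-per-task=10 --mem-per-cpu=8000
ntasks=4
cpus_per_task=10
mem_per_cpu_mb=8000

# Placeholder per-node memory limit in MB. This is NOT a real Puhti value;
# look up the actual limit of your partition.
partition_mem_limit_mb=300000

total_cpus=$(( ntasks * cpus_per_task ))
total_mem_mb=$(( total_cpus * mem_per_cpu_mb ))

if [ "$total_mem_mb" -gt "$partition_mem_limit_mb" ]; then
    verdict="exceeds"
else
    verdict="fits within"
fi

echo "Requested ${total_mem_mb} MB on ${total_cpus} CPUs: ${verdict} the assumed limit of ${partition_mem_limit_mb} MB"
```

Here 40 CPUs at 8000 MB each add up to 320000 MB, which exceeds the assumed limit, so a request like this would be rejected at submission.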

See batch job partitions for more details on the resources available in each partition.