Job arrays offer a mechanism for submitting and managing collections of similar jobs quickly and easily. Instead of submitting N jobs independently, you can submit one array job unifying N tasks.
A job array can be submitted simply by adding
#SBATCH --array=x-y
to the job script where x and y are the array bounds. A job array can also be specified at the command line with
sbatch --array=x-y job_script.sub
A job array will then be created with a number of independent jobs, known as array tasks.
A comma-separated list of task numbers can also be provided, for example when a user wants to rerun specific tasks from a previously completed job array.
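As an illustration, a minimal job script using the directive might look like the sketch below. The job name, the array bounds 1-10, and the output file pattern are assumptions chosen for the example (%A and %a are Slurm's filename patterns for the array job ID and the task index). The fallback value of 1 is also an assumption, added only so the script can be tried by hand outside of Slurm.

```shell
#!/bin/bash
# Hypothetical job name, used only for this example.
#SBATCH --job-name=array_demo
# Ten array tasks, with indices 1 through 10.
#SBATCH --array=1-10
# One output file per task: %A is the array job ID, %a is the task index.
#SBATCH --output=array_demo_%A_%a.out

# Slurm sets SLURM_ARRAY_TASK_ID in each array task. The default of 1 is
# an assumption so that the script also runs outside of Slurm.
task_id="${SLURM_ARRAY_TASK_ID:-1}"

message="Running array task ${task_id}"
echo "${message}"
```

Every task runs the same script; only the value of SLURM_ARRAY_TASK_ID differs between tasks.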
sbatch --array=4,6,12,16 job_script.sub
A user can limit the number of array tasks that run at the same time by using the %N suffix:
#SBATCH --array=1-100%5
The above will create an array of 100 tasks, with at most 5 tasks running at a time.
Each sub-job in a job array will have a job ID that combines the parent SLURM_ARRAY_JOB_ID and a unique SLURM_ARRAY_TASK_ID, separated by the underscore character "_":
101300_1
101300_2
101300_3
101300_4
101300_5
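The same ID can be rebuilt inside a running task from the two environment variables, which is handy for log messages. The sketch below assumes it runs inside an array task. The fallback values 101300 and 1 are placeholders taken from the listing above so that the snippet also runs outside of Slurm.

```shell
#!/bin/bash
# Slurm exports both variables in every array task. The fallbacks are
# placeholders (an assumption) for running the snippet by hand.
array_job_id="${SLURM_ARRAY_JOB_ID:-101300}"
task_id="${SLURM_ARRAY_TASK_ID:-1}"

# Join them in the <array job ID>_<task ID> form.
full_id="${array_job_id}_${task_id}"
echo "This task is ${full_id}"
```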
To cancel a job array or a sub-job, use the scancel command:
scancel 101300 # will cancel the whole array
scancel 101300_1 # will cancel only the first sub-job
The $SLURM_ARRAY_TASK_ID variable can be used inside the job script to handle input and output files for that task.
For example, for a job array with input files named input_1.txt, input_2.txt, etc., one can refer to the input files as input_${SLURM_ARRAY_TASK_ID}.txt. The output files can be handled the same way.
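A job script following this naming scheme could be sketched as below. The function name process_task, the output_N.txt naming, and the use of tr as a stand-in for the real computation are all assumptions made for illustration; replace the tr line with the actual program.

```shell
#!/bin/bash
# Three array tasks, expecting input_1.txt through input_3.txt.
#SBATCH --array=1-3

# process_task is a hypothetical helper: it reads input_<id>.txt and
# writes output_<id>.txt in the current directory.
process_task() {
    local id="$1"
    local input_file="input_${id}.txt"
    local output_file="output_${id}.txt"

    if [ ! -f "${input_file}" ]; then
        echo "Missing input file: ${input_file}" >&2
        return 1
    fi

    # Stand-in for the real work: convert the input to upper case.
    tr '[:lower:]' '[:upper:]' < "${input_file}" > "${output_file}"
}

# The default of 1 is an assumption so the script also runs outside Slurm.
process_task "${SLURM_ARRAY_TASK_ID:-1}" || echo "Task failed" >&2
```

Because each task derives its file names from its own index, the tasks do not overwrite one another's output.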