Slurm job arrays don't work when used in argparse


I am trying to run multiple jobs at once (i.e. in parallel), each with a different value of the argument --start_num. I have written the following batch script:

#!/bin/bash

#SBATCH --job-name fmriGLM # to set the job name
#SBATCH --nodes=1
#SBATCH -t 16:00:00 # Time for running job
#SBATCH -o /scratch/connectome/dyhan316/fmri_preprocessing/FINAL_loop_over_all/output_fmri_glm.o%j # %j is replaced by the job ID
#SBATCH -e /scratch/connectome/dyhan316/fmri_preprocessing/FINAL_loop_over_all/error_fmri_glm.e%j
pwd; hostname; date
#SBATCH --ntasks=30
#SBATCH --mem-per-cpu=3000MB
#SBATCH --cpus-per-task=5
#SBATCH -a 0-5

python FINAL_ARGPARSE_RUN.py --n_division 30 --start_num $SLURM_ARRAY_TASK_ID

Then I submitted it with sbatch --exclude master array_bash_2, but it doesn't work. I have searched many sites and tried multiple things, but the error FINAL_ARGPARSE_RUN.py: error: argument --start_num: expected one argument still appears in the error file, which makes me suspect that $SLURM_ARRAY_TASK_ID in the bash script is not being set or expanded properly.

Could anyone explain why this is and how I can fix it?

Thank you!

CodePudding user response:

The problem is the line pwd; hostname; date. Do not put any non-#SBATCH lines between #SBATCH directives: Slurm stops parsing directives at the first non-directive line, so everything after it (including #SBATCH -a 0-5) is ignored. That means you are not submitting an array job, just a single job, and in a non-array job $SLURM_ARRAY_TASK_ID is unset, which is why --start_num receives no argument. Move that line below the last #SBATCH line and it should work.
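For reference, the corrected script should look something like the following (all #SBATCH directives grouped at the top; paths, resource values, and the Python command are copied unchanged from the question):

#!/bin/bash

#SBATCH --job-name fmriGLM # to set the job name
#SBATCH --nodes=1
#SBATCH -t 16:00:00 # time limit for the job
#SBATCH -o /scratch/connectome/dyhan316/fmri_preprocessing/FINAL_loop_over_all/output_fmri_glm.o%j # %j is replaced by the job ID
#SBATCH -e /scratch/connectome/dyhan316/fmri_preprocessing/FINAL_loop_over_all/error_fmri_glm.e%j
#SBATCH --ntasks=30
#SBATCH --mem-per-cpu=3000MB
#SBATCH --cpus-per-task=5
#SBATCH -a 0-5

# Non-directive commands come only after the #SBATCH block
pwd; hostname; date

python FINAL_ARGPARSE_RUN.py --n_division 30 --start_num $SLURM_ARRAY_TASK_ID

With the array directive actually processed, Slurm launches six tasks (indices 0-5) and sets $SLURM_ARRAY_TASK_ID in each, so --start_num receives a value.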
