How to debug the job array for SLURM through two loops?

Time:03-04

I need to submit many jobs to a cluster with SLURM. Each job takes different input files from different folders. My problem is that the output is incomplete: outputs after the first 8 combinations keep overwriting the previous ones. I suspect the job array is not being created correctly from the combinations of the two variables. Here is my code sample:

#!/bin/bash

#SBATCH --array=1-57         
#SBATCH --time=0            
#SBATCH --nodes=1
#SBATCH --ntasks=1
#SBATCH --cpus-per-task=12
#SBATCH --mem=6G
#SBATCH --output=/storage/proj/AltSp/logs/Lastz_Intron.log

DIR_OUT="/storage/proj/AltSp/data/annotation/Lastz/Br"
mkdir -p ${DIR_OUT}

QUERY="/storage/proj/AltSp/data/annotation/Introns.txt"
Species=/storage/proj/AltSp/data/Species.list   #3 lines: Br\nBn\nBo\n

# Chroms=/storage/proj/AltSp/genomes/Br/chromosomes.list # 20 lines: A1 ~ A20, one at a line
# Chroms=/storage/proj/AltSp/genomes/Bn/chromosomes.list # 18 lines: B1 ~ B18, one at a line
# Chroms=/storage/proj/AltSp/genomes/Bo/chromosomes.list # 19 lines: C1 ~ C19, one at a line

# REF is changing according to spc and chr

for spc in $(cat ${Species}); do
    chr=$(head -n ${SLURM_ARRAY_TASK_ID} genomes/${spc}/chromosomes.list | tail -1)
    REF="/storage/proj/AltSp/genomes/${spc}/${chr}.fasta"
    
    lastz ${REF} ${QUERY} K=3000 H=2200 --format=axt  > ${DIR_OUT}/introns_vs_${spc}-${chr}.axt
done

The output files are:

    introns_vs_Br-A01.axt
    introns_vs_Br-A02.axt
    ...
    introns_vs_Br-A08.axt
    

spc comes from a single file, one name per line. chr comes from several files (one per species), also one name per line. REF changes with each combination of spc and chr, which should give 57 files in total. The 57-task job array is submitted with sbatch and runs 12 jobs at a time in my allocation.

What is wrong with the way my sample code uses SLURM_ARRAY_TASK_ID while looping over the two variables spc and chr, such that outputs get overwritten? Thanks!

CodePudding user response:

I think the issue might be in how you obtain $chr. To verify, add ${SLURM_ARRAY_TASK_ID} to the name of each output file, for example:

lastz ${REF} ${QUERY} K=3000 H=2200 --format=axt  > ${DIR_OUT}/introns_vs_${spc}-${chr}-task${SLURM_ARRAY_TASK_ID}.axt

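If 57 or more output files are then generated, the tasks themselves are running, and the problem lies in how $chr is obtained: several task IDs must be resolving to the same spc and chr and writing to the same file name.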
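
As a possible fix, here is a minimal sketch that maps each task ID to exactly one (spc, chr) pair. In the posted script, every array task runs the whole species loop and uses the task ID as a line number in each chromosome list, so the task ID never identifies a single combination. The helper name pair_for_task and the GENOMES variable are my own, not part of SLURM or lastz; the paths are copied from the question, and I am assuming each chromosomes.list sits under the genomes directory as the comments in the question suggest.

```shell
#!/bin/bash
# Hypothetical helper: print "<spc> <chr>" for the TASK_ID-th combination
# (1-based), counting through every chromosome of every species in order.
# Usage: pair_for_task TASK_ID SPECIES_LIST GENOMES_DIR
pair_for_task() {
    local task_id=$1 species_list=$2 genomes_dir=$3
    local n=0 spc chr
    while read -r spc || [ -n "$spc" ]; do
        [ -n "$spc" ] || continue          # skip blank lines
        while read -r chr || [ -n "$chr" ]; do
            [ -n "$chr" ] || continue
            n=$((n + 1))
            if [ "$n" -eq "$task_id" ]; then
                echo "$spc $chr"
                return 0
            fi
        done < "${genomes_dir}/${spc}/chromosomes.list"
    done < "$species_list"
    return 1                               # task ID exceeds the number of pairs
}

# Only run the alignment when started as a SLURM array task.
if [ -n "${SLURM_ARRAY_TASK_ID:-}" ]; then
    GENOMES="/storage/proj/AltSp/genomes"  # assumed root of the per-species folders
    Species="/storage/proj/AltSp/data/Species.list"
    QUERY="/storage/proj/AltSp/data/annotation/Introns.txt"
    DIR_OUT="/storage/proj/AltSp/data/annotation/Lastz/Br"
    mkdir -p "${DIR_OUT}"

    pair=$(pair_for_task "${SLURM_ARRAY_TASK_ID}" "${Species}" "${GENOMES}") || exit 1
    spc=${pair% *}                         # text before the space
    chr=${pair#* }                         # text after the space
    REF="${GENOMES}/${spc}/${chr}.fasta"

    lastz "${REF}" "${QUERY}" K=3000 H=2200 --format=axt > "${DIR_OUT}/introns_vs_${spc}-${chr}.axt"
fi
```

With 20 + 18 + 19 = 57 chromosomes, --array=1-57 then covers each pair once and there is no for loop left in the job body. Separately, all tasks in the question write to the same log file; adding the %A (job ID) and %a (task ID) patterns to --output, e.g. Lastz_Intron_%A_%a.log, gives each task its own log.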
