Holding job script after completion of one simulation

Time:12-17

I run multiple serial jobs on an HPC cluster. For example, if I have 10 simulations, I request 10 cores and run one simulation per core. However, the simulations finish at different times, and as soon as one of them completes, all the others stop as well. How can I keep the job script alive so that the remaining simulations keep running after one has finished? In other words, how do I make the job script stay on the HPC? An example of my job script:

#!/bin/bash
#SBATCH --job-name=CaseName    # name of the job
#SBATCH --ntasks=60        # number of requested cores
#SBATCH --cpus-per-task=1
#SBATCH --time=7-00:00:00    # time limit
#SBATCH --partition=core64    # queue

cd Folder1
for i in {1..5}
do
        cd Folder$i
        for j in {1..6}
        do
                cd SubFolder$j
                application > log 2>&1 &
                cd ..
        done
        cd ..
done
cd ..

cd LastFolder
application > log 2>&1

Is there a command I can add to the job script so that the job stays on the HPC until every simulation has finished?

CodePudding user response:

You need a wait at the end of your script. You start the simulations in the background, so the script should exit only after all of them have finished.

from man bash:

wait [-fn] [-p varname] [id ...]
              Wait  for  each  specified  child  process  and return 
              its termination status. ...
              ...
              If id is not given, wait waits for all running background jobs...
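
As a rough illustration of where the wait goes, here is a scaled-down sketch of the script from the question. The function fake_application and the temporary directory are hypothetical stand-ins (they are not part of the original job), and the sleep times are arbitrary; the point is only that wait is the last command before the script ends.

```shell
#!/bin/bash
# Hypothetical stand-in for the real `application`: sleeps, then leaves a marker file.
fake_application() {
    sleep "$1"
    echo "finished" > done.txt
}

workdir=$(mktemp -d)          # scratch area instead of Folder1 ... LastFolder
cd "$workdir" || exit 1

for j in 1 2 3; do
    mkdir -p "SubFolder$j"
    cd "SubFolder$j" || exit 1
    fake_application "0.$j" > log 2>&1 &   # background run, as in the question
    cd ..
done

mkdir -p LastFolder
cd LastFolder || exit 1
fake_application 0 > log 2>&1              # foreground run that finishes first
cd ..

# Block until every background run started above has exited.
wait

files=("$workdir"/*/done.txt)
echo "completed runs: ${#files[@]}"
```

Without the wait, the script would reach its end as soon as the foreground run returns, and the batch system would typically end the job and kill whatever is still running in the background. With it, all four marker files should exist by the time the last line runs.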

CodePudding user response:

There's something wrong with your cd logic.

Perhaps try running the cd and the application in a subshell, e.g.

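( cd SubFolder$j && application > log 2>&1 ) &

That way, every command runs concurrently in its own subdirectory without affecting the others. Note that the & sits outside the parentheses: each subshell then remains a child of the script, so a wait at the end of the script still covers it. The && skips the application if the cd fails.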
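
A small sketch of the subshell approach, under some assumptions: sh -c "sleep ...; exit ..." is a placeholder for the real application, the exit codes are made up to show failure detection, and the & is placed outside the parentheses so that $! and wait can see each run. It also checks that the parent script never leaves its starting directory.

```shell
#!/bin/bash
workdir=$(mktemp -d)          # hypothetical scratch directory for the demo
cd "$workdir" || exit 1
start_dir=$PWD

pids=()
for j in 1 2 3; do
    mkdir -p "SubFolder$j"
    # The cd happens inside the subshell only; the placeholder command
    # exits with status j-1 so that two of the three runs "fail".
    ( cd "SubFolder$j" && sh -c "sleep 0.1; exit $((j - 1))" > log 2>&1 ) &
    pids+=("$!")              # remember the PID of each background subshell
done

# The parent shell is still where it started, so no `cd ..` bookkeeping is needed.
if [ "$PWD" = "$start_dir" ]; then
    echo "parent directory unchanged"
fi

# wait <pid> returns that run's exit status, so failures can be counted.
failures=0
for pid in "${pids[@]}"; do
    wait "$pid" || failures=$((failures + 1))
done
echo "failed runs: $failures"
```

Waiting on each PID separately is optional; a plain wait is enough to keep the job alive, but the per-PID form also tells you which runs returned a non-zero status.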
