The right place for `wait` in nested bash loops?

Time:10-18

I have sized my jobs so that 4 processes run comfortably on a 4-vCPU instance, and I have a bash script that launches the 4 processes for one job:

for pidx in {0..3}; do
    prname=$1"-"$pidx
    echo "start java" `date` $prname
    { (java -server -Xmx4g -jar $JARPATH "$prname" $DATADIR) } &
    sleep 10; echo "end sleep" $prname `date` 
done

Now I want to be able to queue up several such jobs, waiting for all four processes to finish before I launch the next set of 4. Here's one attempt:

TSTJOBS="job1 job2"
for JNAME in $TSTJOBS; do
    echo "startloop "  $(date +%T) $JNAME $$
    for pidx in {0..3}; do
        (
        prname=$JNAME"-"$pidx
        echo "start java" $prname $(date +%T) $!
        (java -server -Xmx4g -jar $JARPATH "$prname" $DATADIR) &
        pid=$!
        sleep 10; echo "end java  " $prname $(date +%T) $pid
        )
        wait $pid
      done
    echo "eoloop    "  $(date +%T) $JNAME 
done

I've tried several places for grabbing the processID and for the wait command but either:

  • I wait for all 8 sub-jobs to terminate before proceeding,
  • all jobs are run simultaneously, or
  • I get "pid XXX is not a child of this shell" errors.

Where should I be putting the wait statements in these nested loops?

Answer:

Consider adding a function to take care of the individual job details, eg:

run_job() {
    local prname="$1-$2"
    echo "start java $prname $(date +%T)"
    java -server -Xmx4g -jar "$JARPATH" "$prname" "$DATADIR"
    echo "end java $prname $(date +%T)"
}

Now the main script becomes:

TSTJOBS="job1 job2"

for JNAME in $TSTJOBS; do
    echo "startloop  $(date +%T) $JNAME"

    for pidx in {0..3}; do
        # run the function in the background; NOTE: all 4 calls print to the
        # console, so their output will be interleaved
        run_job "${JNAME}" "${pidx}" &
    done

    # wait for all 4 background calls to finish before the next outer iteration
    wait
    echo "eoloop     $(date +%T) $JNAME"
done
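
If the interleaved console output is a problem, one option is to redirect each background call to its own file. The following is only a sketch: `LOGDIR` is a hypothetical variable that is not part of the original script, and the body of `run_job` is replaced by a `sleep` stand-in so the example runs without java.

```shell
#!/usr/bin/env bash
# Sketch only: LOGDIR and the sleep stand-in are assumptions for illustration.
LOGDIR="${LOGDIR:-$(mktemp -d)}"

run_job() {
    local prname="$1-$2"
    echo "start $prname $(date +%T)"
    sleep 1                        # stand-in for the real java command
    echo "end   $prname $(date +%T)"
}

for JNAME in job1 job2; do
    for pidx in {0..3}; do
        # each background call writes to its own log instead of the console
        run_job "$JNAME" "$pidx" > "$LOGDIR/$JNAME-$pidx.log" 2>&1 &
    done
    wait    # the whole batch of 4 finishes before the next batch starts
done
```

The redirection applies to the whole function call, so both the echo lines and anything the real command prints end up in the same per-job file.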

If for some reason you can't (or don't want to) use the function you can accomplish the same thing inline, eg:

for JNAME in $TSTJOBS; do
    echo "startloop  $(date +%T) $JNAME"

    for pidx in {0..3}; do
        { prname="${JNAME}-${pidx}"
          echo "start java $prname $(date +%T)"
          java -server -Xmx4g -jar "$JARPATH" "$prname" "$DATADIR"
          echo "end java   $prname $(date +%T)"
        } &
    done

    wait
    echo "eoloop     $(date +%T) $JNAME"
done

Or since you really only need to match ending echo calls with the associated java call, a mix will also suffice, eg:

for JNAME in $TSTJOBS; do
    echo "startloop  $(date +%T) $JNAME"

    for pidx in {0..3}; do
        prname="${JNAME}-${pidx}"
        echo "start java $prname $(date +%T)"
        { java -server -Xmx4g -jar "$JARPATH" "$prname" "$DATADIR"
          echo "end java   $prname $(date +%T)"
        } &
    done

    wait
    echo "eoloop     $(date +%T) $JNAME"
done
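
As for the "not a child of this shell" error in the question's attempt: there `pid` is assigned inside a `( ... )` subshell, so the value is gone when the outer shell reaches `wait $pid`, and the java process is a child of that subshell rather than of the script. If you do want to wait on specific PIDs (for example to count failures), read `$!` in the same shell that starts the job. A minimal sketch, where `fake_job` is a hypothetical stand-in for the java command and is made to fail once on purpose:

```shell
#!/usr/bin/env bash
# Sketch only: fake_job is a hypothetical stand-in for the java command.
fake_job() {
    sleep 1
    [ "$1" != "job1-2" ]     # pretend that job1-2 exits with a non-zero status
}

failed=0
for JNAME in job1 job2; do
    pids=()
    for pidx in {0..3}; do
        fake_job "$JNAME-$pidx" &
        pids+=("$!")         # $! is read in the shell that started the job
    done
    for pid in "${pids[@]}"; do
        # wait PID returns the exit status of that particular job
        wait "$pid" || failed=$((failed + 1))
    done
done
echo "failed jobs: $failed"   # prints "failed jobs: 1"
```

A bare `wait` is simpler when you only need the batch barrier; the PID array is worth the extra lines only when the individual exit statuses matter.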