bash - parallel loop with pool of resources

Time:08-03

I have a script that looks like this:

for x in ...; do
    for y in ...; do
        # run several commands which depend on x and y and require a single GPU
        # (I also need to specify which GPU to use)
        command1 $x $y GPU0
        command2 $x $y GPU0
    done
done

# Some stuff after the loop

I have 4 GPUs and want to make the loop parallel: for each (x, y) iteration, wait until some GPU is available, start the commands on it, and move on to the next iteration without waiting for the current one to finish. How do I do this?

I know about the flock command, so I could create a lock file for each GPU and use it to control access to that GPU. But, as I understand it, that requires knowing in advance which GPU the current (x, y) iteration plans to use.

Another concern is how to guarantee that every iteration uses the correct x and y. When the loop moves on to the next iteration, x and y change, and that change must not be reflected in the command1 $x $y GPU... call started by the previous iteration.
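
On the second concern: a command group started with & runs in a forked subshell, which gets its own copy of the shell variables at the moment it starts, so later changes to x and y in the loop do not reach it. A minimal sketch to check this (the function name run_demo and the sample values are made up for illustration):

```bash
#!/bin/bash
# Hypothetical demo, not part of the original script.
run_demo() {
    local x y
    for x in x1 x2; do
        for y in y1 y2; do
            # The group runs in a forked subshell; by the time it prints,
            # the loop has already moved on, yet it still sees its own x and y.
            { sleep 0.2; echo "$x $y"; } &
        done
    done
    wait   # block until all background jobs have finished
}

run_demo | sort
```

Sorted, the output should list each pair exactly once: x1 y1, x1 y2, x2 y1, x2 y2.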

CodePudding user response:

I assume that you're not targeting macOS because you're using GPUs, so here's a bash-4 solution for getting you started (you might want to improve on it by defining functions, traps, etc...):

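#!/bin/bash

# Keys are the GPUs that are currently idle (values are unused).
declare -A avail_gpus=( [GPU0]= [GPU1]= [GPU2]= [GPU3]= )
# Maps the PID of each running job to the GPU it occupies.
declare -a procs_gpus=( )

for x in x1 x2 x3
do
    for y in y1 y2 y3
    do
        # Poll until at least one GPU is idle.
        while :
        do
            for pid in "${!procs_gpus[@]}"
            do
                # kill -0 only tests whether the process still exists.
                kill -0 "$pid" 2> /dev/null && continue
                echo "freed ${procs_gpus[pid]}" # demo
                avail_gpus["${procs_gpus[pid]}"]=
                unset procs_gpus[pid]
            done
            (( ${#avail_gpus[@]} > 0 )) && break
            sleep .5
        done
        # Pick an arbitrary idle GPU (first key of the associative array).
        for gpu in "${!avail_gpus[@]}"; do break; done
        # Start the job in the background; it keeps its own copy of x, y and gpu.
        {
            echo "$x" "$y" "$gpu"         # demo
            sleep "$((1 + $RANDOM % 10))" # demo
            #command1 "$x" "$y" "$gpu"
            #command2 "$x" "$y" "$gpu"
        } &
        # Record which GPU the new job holds and mark it busy.
        procs_gpus[$!]=$gpu
        unset avail_gpus["$gpu"]
    done
done

# Wait for the remaining jobs before running anything after the loop.
wait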

With the above code you'll get something like:

x1 y1 GPU2
x1 y2 GPU3
x1 y3 GPU0
x2 y1 GPU1
freed GPU0
x2 y2 GPU0
freed GPU2
x2 y3 GPU2
freed GPU3
x3 y1 GPU3
freed GPU0
freed GPU2
x3 y2 GPU2
x3 y3 GPU0
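
As a possible variation on the polling loop, the pool of GPUs can be kept in a FIFO that holds one token per idle GPU: reading a token blocks until some GPU is free, and each job writes its token back when it finishes. This is only a sketch under some assumptions: the function name run_pool and the use of file descriptor 3 are illustrative, opening a FIFO read/write without blocking is Linux behaviour rather than something POSIX guarantees, and the echo/sleep lines stand in for command1 and command2.

```bash
#!/bin/bash
# Sketch: a FIFO used as a pool of GPU tokens (names are illustrative).
run_pool() {
    local dir x y gpu
    dir=$(mktemp -d) || return 1
    mkfifo "$dir/pool" || return 1
    exec 3<>"$dir/pool"   # read/write open does not block (Linux behaviour)
    rm -rf "$dir"         # the open descriptor keeps the pipe alive

    # Seed the pool with one token per GPU.
    printf '%s\n' GPU0 GPU1 GPU2 GPU3 >&3

    for x in x1 x2 x3; do
        for y in y1 y2 y3; do
            read -r gpu <&3          # blocks until some GPU is free
            {
                echo "$x $y $gpu"    # stand-in for command1 / command2
                sleep 0.2
                echo "$gpu" >&3      # hand the GPU back to the pool
            } &
        done
    done
    wait        # let the last jobs finish
    exec 3>&-   # close the pool
}

run_pool
```

Only the main loop reads from the pool, so there is no race between readers, and each background job keeps the x, y and gpu values it had when it was started.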

CodePudding user response:

If you have GNU Parallel:

parallel -j4 'echo Do {1} and {2} on GPU {%}; sleep 1.$RANDOM;' ::: a b c ::: X Y Z

To follow the progress use --lb:

parallel -j4 --lb 'echo Do {1} and {2} on GPU {%}; sleep 1.$RANDOM; echo GPU {%} done' ::: a b c ::: X Y Z
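
Note that {%} is the job slot number, which runs from 1 to 4 with -j4, whereas the question names its GPUs GPU0 to GPU3. A sketch of one way to map slots to zero-based GPU names, assuming GNU Parallel is installed (the function name run_pool and the echo stand-in for command1/command2 are illustrative):

```bash
#!/bin/bash
# Sketch: map GNU Parallel job slots 1..4 to GPU0..GPU3.
run_pool() {
    # {%} is replaced by the slot number before the shell runs the command,
    # so $(({%} - 1)) evaluates to 0..3 inside each job.
    parallel -j4 'gpu=GPU$(({%} - 1)); echo "{1} {2} $gpu"' \
        ::: x1 x2 x3 ::: y1 y2 y3
}

# Some systems ship a different tool named "parallel", so check first.
if parallel --version 2>/dev/null | grep -q 'GNU parallel'; then
    run_pool
else
    echo 'GNU Parallel not found' >&2
fi
```

In a real run, the echo would be replaced by something like command1 {1} {2} "$gpu"; command2 {1} {2} "$gpu".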