Is there a way to run scripts with dependencies via GNU parallel?
I wish to run the following scripts:
aa_00.sh # run time ~6 hr
aa_01.sh # dependent on aa_00.sh; ~6 hr
aa_02.sh # dependent on aa_00.sh; ~6 hr
aa_03.sh # dependent on aa_00.sh; ~6 hr
bb_00.sh # run time ~2 hr
bb_01.sh # dependent on bb_00.sh; ~2 hr
bb_02.sh # dependent on bb_00.sh; ~2 hr
bb_03.sh # dependent on bb_00.sh; ~2 hr
Scripts aa_01.sh, aa_02.sh, and aa_03.sh must not run until script aa_00.sh completes. Scripts aa_01.sh, aa_02.sh, and aa_03.sh are completely independent of each other and can run in parallel.
Similarly, scripts bb_01.sh, bb_02.sh, and bb_03.sh must not run until script bb_00.sh completes. Scripts bb_01.sh, bb_02.sh, and bb_03.sh are completely independent of each other and can run in parallel.
I have 4 CPUs [*].
[*] Actually, I am using GPUs so I am using:
'eval CUDA_VISIBLE_DEVICES={%} {}'
# I removed the "({%} - 1)" notation just for simplicity here
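For context, a minimal sketch of that slot-to-GPU mapping (assuming GNU parallel is installed and GPUs are numbered from 0):

```shell
# parallel's {%} is the 1-based job-slot number; with -j4 it is 1..4.
# Subtracting 1 maps each slot to a 0-based GPU index (0..3).
parallel -j4 'CUDA_VISIBLE_DEVICES=$(({%} - 1)) sh -c "echo running {} on GPU \$CUDA_VISIBLE_DEVICES"' ::: aa_00.sh bb_00.sh
```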
Is there a way to run these 8 scripts efficiently such that the dependencies on aa_00.sh and bb_00.sh are respected?
One idea I had was, at the completion of aa_00.sh, to release the subsequent aa_{01,02,03}.sh scripts via parallel, and at the completion of bb_00.sh, to release the subsequent bb_{01,02,03}.sh scripts via parallel. But because two different runs of parallel are used, the bb_* scripts don't know that the aa_* scripts are running (and vice versa):
cat commands_aa.txt
aa_01.sh
aa_02.sh
aa_03.sh
CUDA_VISIBLE_DEVICES=0 aa_00.sh
parallel -j4 -a commands_aa.txt 'eval CUDA_VISIBLE_DEVICES={%} {}'
cat commands_bb.txt
bb_01.sh
bb_02.sh
bb_03.sh
CUDA_VISIBLE_DEVICES=1 bb_00.sh
parallel -j4 -a commands_bb.txt 'eval CUDA_VISIBLE_DEVICES={%} {}'
Conceptually, I'd like to add inputs to an already-running parallel command. I tried overwriting the -a commands.txt file while parallel was already running, but that did not achieve what I wanted (I would have been shocked if it had worked).
In actuality, I have more than just aa and bb scripts; I have as many as 8 or 10 (i.e., aa, bb, ..., hh, ii, ...). And I have more than 3 scripts that run after the _00 script; I have 12 in total: _00, _01, ..., _11. All of them depend on their respective _00 script.
I was looking at the Python library luigi, too. luigi can handle dependencies, but I don't think it can handle parallelization. I also looked at the Python module joblib.Parallel(). Perhaps I need to combine luigi and joblib.Parallel().
Thank you.
Additional Thoughts
- I do think what I need is to have each _00 script add its dependents upon its completion.
- But I need to add these dependents to the list that parallel is already working on.
Something like this (conceptually):

commands.txt contains:
aa_00.sh
bb_00.sh

Run parallel:
parallel -j4 -a commands.txt 'eval CUDA_VISIBLE_DEVICES={%} {}'

The two jobs start:
CUDA_VISIBLE_DEVICES=1  <-- aa_00.sh
CUDA_VISIBLE_DEVICES=2  <-- bb_00.sh
When bb_00.sh completes, it appends its dependents to the bottom of commands.txt, like so.

commands.txt updated:
aa_00.sh # still running on GPU 1
bb_00.sh # this completed on GPU 2
bb_01.sh # these new scripts are
bb_02.sh # appended to
bb_03.sh # commands.txt
Somehow, parallel is magically okay with these new lines of input, and the new scripts are queued to GPUs 3, 4, and 2:

CUDA_VISIBLE_DEVICES=3  <-- bb_01.sh
CUDA_VISIBLE_DEVICES=4  <-- bb_02.sh
CUDA_VISIBLE_DEVICES=2  <-- bb_03.sh
CUDA_VISIBLE_DEVICES=1  <-- aa_00.sh (still running)
bb_01.sh completes on GPU 3; it has no dependents, so nothing is appended to commands.txt.
The joblog would look something like:
aa_00.sh GPU=1 running
bb_00.sh GPU=2 completed
bb_01.sh GPU=3 completed
bb_02.sh GPU=4 running
bb_03.sh GPU=2 running
Eventually aa_00.sh completes, so it appends its dependents to the bottom of commands.txt.

commands.txt updated:
aa_00.sh # completed on GPU 1
bb_00.sh # completed on GPU 2
bb_01.sh # completed on GPU 3
bb_02.sh # running on GPU 4
bb_03.sh # running on GPU 2
aa_01.sh # these new scripts are
aa_02.sh # appended to
aa_03.sh # commands.txt
Again, parallel is magically okay with these new lines of input, so it dishes out the new scripts to the available GPUs:

CUDA_VISIBLE_DEVICES=3  <-- aa_01.sh
CUDA_VISIBLE_DEVICES=1  <-- aa_02.sh
Suppose bb_02.sh completes next, freeing up GPU 4:

CUDA_VISIBLE_DEVICES=4  <-- aa_03.sh
Now the joblog looks something like:
aa_00.sh GPU=1 completed
bb_00.sh GPU=2 completed
bb_01.sh GPU=3 completed
bb_02.sh GPU=4 completed
bb_03.sh GPU=2 completed
aa_01.sh GPU=3 running
aa_02.sh GPU=1 running
aa_03.sh GPU=4 running
(I may have mixed up the numbering, and the timing surely isn't correct, since aa runs 3x longer than bb, but hopefully I explained the ordering correctly.)
It's the "magical" part of parallel that I'm unsure of.
CodePudding user response:
Look at https://www.gnu.org/software/parallel/man.html#example-gnu-parallel-as-queue-system-batch-manager
So something like:
true >jobqueue; tail -n 0 -f jobqueue | parallel -j4 'eval CUDA_VISIBLE_DEVICES={%} {}'
echo "aa_00.sh; (echo aa_01.sh; echo aa_02.sh; echo aa_03.sh) >> jobqueue" >> jobqueue
echo "bb_00.sh; (echo bb_01.sh; echo bb_02.sh; echo bb_03.sh) >> jobqueue" >> jobqueue
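Since the question mentions up to 8-10 prefixes with eleven dependents each (_01 .. _11), the queue entries need not be written by hand. A sketch of generating them in a loop (the prefix list is illustrative, and seq -w is GNU coreutils):

```shell
# Build one jobqueue entry per prefix: run the _00 script, and on
# completion append its dependents (_01 .. _11) to the queue.
for prefix in aa bb; do          # extend the list: cc dd ... ii
  deps=""
  for i in $(seq -w 1 11); do    # -w zero-pads: 01 02 ... 11
    deps="${deps}echo ${prefix}_${i}.sh; "
  done
  echo "${prefix}_00.sh; ( ${deps}) >> jobqueue"
done >> jobqueue
```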
We are clearly in territory where there must be better tools: GNU Parallel does not have a dependency graph like make has.
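For reference, the same dependency structure is natural to express with make and run with make -j4, which does track a dependency graph. A minimal sketch, assuming each script can be wrapped in a .done stamp-file target (note: make -j4 gives the 4-way parallelism but not the per-job GPU-slot assignment):

```make
# Run with: make -j4
AA_DEPS = aa_01 aa_02 aa_03
BB_DEPS = bb_01 bb_02 bb_03

all: $(AA_DEPS:=.done) $(BB_DEPS:=.done)

# dependents wait for their _00 stamp
$(AA_DEPS:=.done): aa_00.done
$(BB_DEPS:=.done): bb_00.done

# generic rule: run the script, then record completion
%.done:
	./$*.sh && touch $@
```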