I am trying to use GNU Parallel to run a script that has multiple binary flags. I would like to enable/disable these as follows:
Given a script named "sample.py", with two options, "--seed", which takes an integer, and "--something", which is a binary flag and takes no input, I would like to construct a call to parallel that produces the following calls:
python sample.py --seed 1111
python sample.py --seed 1111 --something
python sample.py --seed 2222
python sample.py --seed 2222 --something
python sample.py --seed 3333
python sample.py --seed 3333 --something
I've tried things like
parallel python sample.py --seed {1} {2} ::: 1111 2222 3333 ::: "" --something
parallel python sample.py --seed {1} {2} ::: 1111 2222 3333 ::: '' --something
parallel python sample.py --seed {1} {2} ::: 1111 2222 3333 ::: \ --something
but haven't had any luck. Is what I'm trying to achieve possible with GNU parallel? I can modify my script to take explicit TRUE/FALSE values for the flag but I'd prefer to avoid that if possible.
CodePudding user response:
> bash$ cat sample.py
#!/usr/bin/python3
import sys
import time
time.sleep(0.2)
print(sys.argv)
> bash$ cat split.sh
#!/bin/sh
exec $*
> bash$ for seed in 1111 2222 3333; do \
printf "%s\0" "$seed" "$seed --something"; \
done \
| xargs -0 parallel \
sh split.sh python3 sample.py --
['sample.py', '1111', '--something']
['sample.py', '1111']
['sample.py', '2222']
['sample.py', '2222', '--something']
['sample.py', '3333', '--something']
['sample.py', '3333']
Explanation:
The first part, the loop, just creates the list of arguments:
1111
1111 --something
2222
2222 --something
3333
3333 --something
xargs reads that list from stdin and passes it to parallel as arguments.
split.sh splits its arguments by whitespace. Here we're assuming that the arguments to your script don't have whitespace in them.
So we call sh split.sh, which will basically execute the command by splitting arguments like "2222 --something" into 2222 and --something.
Those arguments will be passed to python3 sample.py, so you get a shell command like python3 sample.py 2222 --something that is run by parallel.
If we didn't use split.sh and just called python directly (xargs -0 parallel python3 sample.py --), then when xargs passes "2222 --something" as a single argument, parallel would have run something like python3 sample.py '2222 --something'.
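The difference between the two cases can be sketched in Python. This is only an approximation of what exec $* does in split.sh: str.split() splits on runs of whitespace but, unlike an unquoted shell expansion, does no glob expansion. The helper name build_argv is made up for this illustration.

```python
from typing import List


def build_argv(item: str, split: bool) -> List[str]:
    """Return the argv that python3 would receive for one input item.

    build_argv is a hypothetical helper, for illustration only.
    """
    base = ["python3", "sample.py"]
    # With split.sh, the item is broken up on whitespace (approximated here
    # with str.split(), which ignores shell globbing).
    # Without it, the whole item arrives as a single argument.
    extra = item.split() if split else [item]
    return base + extra


if __name__ == "__main__":
    # With split.sh: ['python3', 'sample.py', '2222', '--something']
    print(build_argv("2222 --something", split=True))
    # Without it:    ['python3', 'sample.py', '2222 --something']
    print(build_argv("2222 --something", split=False))
```

In the second case the script sees one argument that contains a space, which it would not be able to interpret as a seed followed by a flag.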
CodePudding user response:
You are so close.
GNU Parallel quotes replacement strings. That usually makes sense, because it is then safe to give it filenames like:
My brother's 12" records, all with ***.csv
which could otherwise give no end of troubles.
However, to be consistent, GNU Parallel also quotes the empty string, and that is what is biting you here.
--dry-run shows what is going on:
$ parallel --dry-run python sample.py --seed {1} {2} ::: 1111 2222 3333 ::: '' --something
python sample.py --seed 1111 ''
python sample.py --seed 1111 --something
python sample.py --seed 2222 ''
python sample.py --seed 2222 --something
python sample.py --seed 3333 ''
python sample.py --seed 3333 --something
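To see why the quoted '' matters on the receiving end, here is a minimal sketch. It assumes sample.py parses its options with argparse, with --seed as an integer and --something as a store_true flag; the question does not show the script, so that layout is a guess. The function name parse is likewise hypothetical.

```python
import argparse
from typing import List, Tuple


def parse(argv: List[str]) -> Tuple[argparse.Namespace, List[str]]:
    """Parse argv the way a hypothetical argparse-based sample.py might."""
    parser = argparse.ArgumentParser(prog="sample.py")
    parser.add_argument("--seed", type=int)
    parser.add_argument("--something", action="store_true")
    # parse_known_args returns leftovers instead of exiting, which makes the
    # stray argument visible; parse_args would reject it with an error.
    return parser.parse_known_args(argv)


if __name__ == "__main__":
    # What the script receives for: python sample.py --seed 1111 ''
    print(parse(["--seed", "1111", ""]))
    # What it receives for: python sample.py --seed 1111 --something
    print(parse(["--seed", "1111", "--something"]))
```

Under these assumptions the first call leaves an empty string as an unrecognized leftover argument, so a script that uses a strict parse would fail instead of running without the flag.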
So how can you avoid that?
You can tell the shell to evaluate all strings:
parallel eval python sample.py --seed {1} {2} ::: 1111 2222 3333 ::: '' --something
but that might be a bit of a blunt hammer when you need a scalpel. From version 20190722 you can also use {=uq=}. uq() is a Perl function which tells GNU Parallel that this replacement string should not be quoted:
$ parallel-20190722 --dry-run python sample.py --seed {1} {=2 uq=} ::: 1111 2222 3333 ::: '' --something
python sample.py --seed 1111
python sample.py --seed 1111 --something
python sample.py --seed 2222
python sample.py --seed 2222 --something
python sample.py --seed 3333
python sample.py --seed 3333 --something
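The two dry-run listings can be reproduced with a rough Python model. It is a sketch, not a description of how GNU Parallel works internally: shlex.quote stands in for GNU Parallel's own quoting (the two are not guaranteed to agree on every input), and render and quote_flag are names invented for this example.

```python
import itertools
import shlex
from typing import List


def render(seeds: List[str], flags: List[str], quote_flag: bool) -> List[str]:
    """Render one command line per (seed, flag) combination.

    A rough model only: shlex.quote stands in for GNU Parallel's quoting.
    """
    commands = []
    # Two ::: input sources are combined as a Cartesian product.
    for seed, flag in itertools.product(seeds, flags):
        shown = shlex.quote(flag) if quote_flag else flag
        commands.append(f"python sample.py --seed {shlex.quote(seed)} {shown}")
    return commands


if __name__ == "__main__":
    seeds = ["1111", "2222", "3333"]
    flags = ["", "--something"]
    print("\n".join(render(seeds, flags, quote_flag=True)))   # like {2}
    print("\n".join(render(seeds, flags, quote_flag=False)))  # like {=2 uq=}
```

With quoting, the empty flag is rendered as '' and survives as a real (empty) argument when the shell parses the line. Without quoting it leaves only trailing whitespace, which the shell discards, so the script is called with just the seed.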