I have a bash script that runs in parallel inside a for loop, like this:
for (( n=0; n<${LIMIT}; n++ )); do
  my_function $n &
done
my_function is a function that runs on LIMIT virtual machines/nodes over ssh. It works fine, but I am not able to keep track of execution progress. It would be nice to have a progress bar or some other tool that can give a completion estimate. Unfortunately, none of the approaches I've found apply to this parallel case.
Thanks
CodePudding user response:
The script below uses the widely packaged third-party tool pv ("pipe viewer") to write a progress bar. pv reads input from stdin and writes it to stdout, but can also write a progress bar (and optionally an ETA and other statistics) to stderr in the process. It's most commonly used in pipelines like curl | pv | tar, but it is perfectly suitable for this use case as well.
#!/usr/bin/env bash
# version check: make sure we have automatic FD allocation and redirection to
# variable-provided FDs available in the version of bash in use
case $BASH_VERSION in
  ''|[1-3].*|4.[012].*) echo "ERROR: bash 4.3 required" >&2; exit 1;;
esac
# Stub for OP's real function they want output from
my_function() {
  # simulate doing something slow that takes a differing amount of time
  sleep $(( 2 + (RANDOM % 10) ))
}
# note "limit" is lowercase -- that's per POSIX guidelines
# see https://pubs.opengroup.org/onlinepubs/9699919799/basedefs/V1_chap08.html
limit=100 # as an example
pids=( )
# allocate a file descriptor that writes to a copy of "pv" -- "pipe viewer"
exec {pv_fd}> >(exec pv --progress --size "$limit" >/dev/null)
pv_pid=$!
for (( n=0; n<limit; n++ )); do
  # every time my_function completes, write to the copy of pv
  (my_function "$n" {pv_fd}>&-; rc=$?; printf '\n' >&$pv_fd; exit "$rc") & pids[n]=$!
done
exec {pv_fd}>&- # now, close the process substitution feeding to pv
# pause here until all copies of my_function have completed
# by looping and passing each copy of wait only one PID, we can retrieve exact
# exit status for each copy.
for n in "${!pids[@]}"; do
  pid=${pids[$n]}
  wait "$pid" || echo "WARNING: Process $n exited with status $?" >&2
done
wait "$pv_pid" # and wait for pv to exit
echo "Finished running all instances of my_function" >&2
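If pv isn't available, the same idea (each job writes one newline to a shared pipe when it finishes, and a single reader counts them) can be sketched in plain bash. This is only a rough illustration, not a replacement for the script above: show_progress and run_all are hypothetical helper names, the stub my_function just sleeps for a fraction of a second, and per-job exit statuses are not collected.

```shell
#!/usr/bin/env bash
# Hypothetical stand-in for the real my_function: sleeps 0.0 to 0.4 seconds
my_function() {
  sleep "0.$(( RANDOM % 5 ))"
}

limit=10 # as an example

# Reader side: consume one line per finished job and redraw a counter on stderr
show_progress() {
  local total=$1 done_count=0
  while IFS= read -r _; do
    done_count=$(( done_count + 1 ))
    printf '\r%d/%d done (%d%%)' "$done_count" "$total" \
      $(( 100 * done_count / total )) >&2
  done
  printf '\n' >&2
}

# Writer side: start every job in the background; each one emits a single
# newline on stdout when it finishes. my_function's own stdout is discarded
# so it cannot be mistaken for a completion marker.
run_all() {
  local n
  for (( n=0; n<limit; n++ )); do
    ( my_function "$n" >/dev/null; echo ) &
  done
  wait # keep the pipe open until every job has reported
}

run_all | show_progress "$limit"
```

Because the counter is driven only by completion markers, it reports how many jobs have finished rather than how far along any single job is, which is the same granularity the pv version provides.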
CodePudding user response:
With GNU Parallel you can do this:
export -f my_function
# seq is inclusive, so stop at LIMIT-1 to match the loop's 0..LIMIT-1 range
seq 0 $(( LIMIT - 1 )) |
  parallel -j"${LIMIT}" --bar my_function