Display progress with script running in parallel

Time:10-28

I have a bash script that runs in parallel inside a for loop, like this:

for (( n=0; n<${LIMIT}; n++ )); do
   my_function $n &
done

my_function is a function that runs on LIMIT virtual machines/nodes over ssh. It works fine, but I am not able to keep track of execution progress. It would be nice to have a progress bar or any other tool that can give a completion estimate. Unfortunately, none of the approaches I've found apply to this parallel case.

Thanks

CodePudding user response:

The script below uses the widely packaged third-party tool pv ("pipe viewer") to write a progress bar.

pv reads input from stdin and writes it to stdout -- but can write a progress bar (and optionally, an ETA and other statistics) to stderr in the process. It's most commonly used in situations like curl | pv | tar, but is perfectly suitable for this use case as well.
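
As a rough illustration of that behaviour, the sketch below feeds pv one byte per simulated unit of work. The emit_ticks helper, the count of 5, and the 0.2-second delay are made up for the example, and the pv call is skipped when pv is not installed.

```shell
#!/usr/bin/env bash

# emit_ticks COUNT [DELAY]: hypothetical helper that writes one newline
# (one byte) per simulated unit of work, pausing DELAY seconds before each.
emit_ticks() {
  local count=$1 delay=${2:-0} i
  for (( i = 0; i < count; i++ )); do
    sleep "$delay"
    printf '\n'
  done
}

# pv copies stdin to stdout (discarded here) and draws its bar on stderr.
# --size tells pv how many bytes to expect, so five one-byte ticks fill the bar.
if command -v pv >/dev/null 2>&1; then
  emit_ticks 5 0.2 | pv --progress --size 5 >/dev/null
fi
```

Because the bar goes to stderr, stdout can be sent anywhere (here /dev/null) without hiding the progress display.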

#!/usr/bin/env bash

# version check: make sure we have automatic FD allocation and redirection to
# variable-provided FDs available in the version of bash in use
case $BASH_VERSION in
  ''|[1-3].*|4.[012].*) echo "ERROR: bash 4.3+ required" >&2; exit 1;;
esac

# Stub for OP's real function they want output from
my_function() {
  # simulate doing something slow that takes a differing amount of time
  sleep $(( 2 + (RANDOM % 10) ))
}

# note "limit" is lowercase -- that's per POSIX guidelines
# see https://pubs.opengroup.org/onlinepubs/9699919799/basedefs/V1_chap08.html
limit=100 # as an example
pids=( )

# allocate a file descriptor that writes to a copy of "pv" -- "pipe viewer"
exec {pv_fd}> >(exec pv --progress --size "$limit" >/dev/null)
pv_pid=$!

for (( n=0; n<limit; n++ )); do
  # every time my_function completes, write to the copy of pv
  (my_function "$n" {pv_fd}>&-; rc=$?; printf '\n' >&$pv_fd; exit "$rc") & pids[n]=$!
done

exec {pv_fd}>&- # now, close the process substitution feeding to pv

# pause here until all copies of my_function have completed
# by looping and passing each copy of wait only one PID, we can retrieve exact
# exit status for each copy.
for n in "${!pids[@]}"; do
  pid=${pids[$n]}
  wait "$pid" || echo "WARNING: Process $n exited with status $?" >&2
done

wait "$pv_pid" # and wait for pv to exit

echo "Finished running all instances of my_function" >&2
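
If pv is not available, the same idea (each job writes one line to a shared pipe when it finishes) can be approximated in plain bash. This is only a sketch: run_with_counter, the stub my_function, and limit=8 are assumptions for illustration, and it prints a simple "done/total" counter rather than a bar or an ETA.

```shell
#!/usr/bin/env bash

# Hypothetical stand-in for the real my_function: sleeps 0.0 to 0.4 seconds.
my_function() { sleep "0.$(( RANDOM % 5 ))"; }

limit=8 # example value

# run_with_counter TOTAL: start TOTAL background jobs. Each job writes one
# line to a shared pipe when it finishes; the reader loop tallies those lines
# and redraws a counter on stderr. The final tally is printed on stdout.
run_with_counter() {
  local total=$1 done_count=0 n
  while read -r _; do
    done_count=$(( done_count + 1 ))
    printf '\r%d/%d done' "$done_count" "$total" >&2
  done < <(
    for (( n = 0; n < total; n++ )); do
      # my_function's own stdout is discarded so it cannot add extra lines
      # to the pipe and inflate the count.
      ( my_function "$n" >/dev/null; echo "$n" ) &
    done
    wait # keep the pipe open until every job has reported
  )
  printf '\n' >&2
  echo "$done_count"
}

completed=$(run_with_counter "$limit")
echo "Finished $completed of $limit jobs" >&2
```

Unlike the pv version, this sketch does not collect per-job exit statuses; it only counts completions.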

CodePudding user response:

With GNU Parallel you can do this:

export -f my_function
seq 0 $(( LIMIT - 1 )) |
  parallel -j"${LIMIT}" --bar my_function