How to run multiple tasks at the same time in a loop


I have a script like this:

for i in `seq 100`
do
  echo $i
  some-command $i  # runs for about 1 minute
done

I would like to run 10 some-command tasks at the same time. How can I do this? Here is my attempt:

for i in `seq 1 10 100` # step 10
do
  echo $i
  some-command $i &
  some-command $((i+1)) &
  some-command $((i+2)) &
  some-command $((i+3)) &
  some-command $((i+4)) &
  some-command $((i+5)) &
  some-command $((i+6)) &
  some-command $((i+7)) &
  some-command $((i+8)) &
  some-command $((i+9)) &

  wait # wait for all 10 before starting the next batch
done

CodePudding user response:

IMO, no need for a loop:

seq 100 | xargs -n 1 -P 10 some-command
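To see the effect without the real some-command (which isn't shown in the question), you can substitute a small sh -c stand-in; -n 1 passes one argument per invocation and -P 4 keeps up to four of them running at once:

```shell
# Run up to 4 processes at once (-P 4), one argument each (-n 1).
# 'sh -c … --' receives each number as $1; echo stands in for some-command.
seq 10 | xargs -n 1 -P 4 sh -c 'echo "processed $1"' --
```

The output lines may arrive out of order, since the processes run concurrently.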

CodePudding user response:

You could use GNU parallel, which is designed especially for this:

seq 100 | parallel -j10 'some-command {}'

Or GNU make, which is much more than a parallelizing tool but can do this perfectly:

$ cat Makefile
JOBS := $(shell seq 100)
.PHONY: all $(JOBS)
all: $(JOBS)
$(JOBS):
    some-command $@

$ make -j10

Warning: if you copy-paste this in a Makefile do not forget to replace the 4 leading spaces before some-command $@ by a tab.

CodePudding user response:

You can run background jobs (separate processes) in bash:

for i in `seq 100`; do
  echo $i
  sleep 5 &                                # stand-in for your long-running command
  while [ $(jobs -r | wc -l) -ge 10 ]; do  # count only still-running jobs
    sleep 1
  done
done
wait  # let the last batch finish

The & after a command runs it in its own background process. You can put the ampersand after single commands or after compound commands like loops. If you have multiple commands, you can group them in ( ) and put the ampersand after the closing parenthesis.

In my example, "sleep 5" stands in for your long-running command.

The added while loop prevents the script from spawning all the jobs at once; this example limits it to 10 running concurrently.
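Grouping several commands with parentheses, as mentioned above, looks like this (echo and sleep are stand-ins for real work):

```shell
for i in 1 2 3; do
  (                       # subshell: these commands run together in the background
    echo "job $i starting"
    sleep 1               # stand-in for the long-running command
    echo "job $i done"
  ) &
done
wait                      # block until every background subshell has exited
echo "all jobs finished"
```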

CodePudding user response:

If you want to run commands in parallel in a controlled manner (i.e. (1) limit the number of parallel commands, (2) track their return statuses and (3) ensure that new commands are started once their predecessors finish, until all commands have run), you can reuse a simple harness, copied from my other answer here.

Just plug in your preferences, replace do_something_and_maybe_fail with the programs you want to run (which you can iterate through by modifying the place where pname is generated, some_program_{a..f}{0..5}), and you're good to go.

The harness is runnable as-is. Its processes randomly sleep and randomly fail and there are 20 execution slots (MAX_PARALLELISM) for 36 “commands” (some_program_{a..f}{0..5}), so, quite obviously, a few commands will need to wait for others to finish (so that at most 20 of them run in parallel).

#!/bin/bash
set -euo pipefail

declare -ir MAX_PARALLELISM=20  # pick a limit
declare -i pid
declare -a pids=()

do_something_and_maybe_fail() {
  sleep $((RANDOM % 10))
  return $((RANDOM % 2 * 5))
}

for pname in some_program_{a..f}{0..5}; do  # 36 items
  if ((${#pids[@]} >= MAX_PARALLELISM)); then
    # wait -n with -p requires bash >= 5.1
    wait -p pid -n \
    && echo "${pids[pid]} succeeded" 1>&2 \
    || echo "${pids[pid]} failed with ${?}" 1>&2
    unset 'pids[pid]'
  fi

  do_something_and_maybe_fail &  # forking here
  pids[$!]="${pname}"
  echo "${#pids[@]} running" 1>&2
done

for pid in "${!pids[@]}"; do
  wait -n "$((pid))" \
  && echo "${pids[pid]} succeeded" 1>&2 \
  || echo "${pids[pid]} failed with ${?}" 1>&2
done