Rsync on multiple hosts in parallel

Time:11-09

I frequently need to send a lot of files to multiple hosts. Speed is crucial, so I want to run the transfers in parallel.

How can I run rsync to multiple hosts in parallel from a bash script?

Right now the script looks like this:

   for i in ${listofhosts[*]}
   do
     rsync -rv --checksum folder/ -e "ssh -i rsa_key -o StrictHostKeyChecking=no" user@$i:/var/test/folder --delete || exit 1
   done

Later edit: I'm thinking of something with GNU Parallel or xargs, but I don't know how to use them in this situation.

CodePudding user response:

With just a shell script,

#!/bin/bash
procs=()
for i in "${listofhosts[@]}"; do  # notice syntax fixes
  # start each rsync in the background and remember its PID
  rsync -rv --checksum folder/ -e "ssh -i rsa_key -o StrictHostKeyChecking=no" \
    "user@$i:/var/test/folder" --delete &
  procs+=("$!")
done
# block until every background rsync has finished
for proc in "${procs[@]}"; do
  wait "$proc"
done

The obvious drawback is that you can't cancel the others as soon as one of them fails. If you really have "a lot" of hosts, this will probably saturate your network bandwidth to the point where you regret asking about how to do this.
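If cancelling isn't required but you still want to know which hosts failed, the wait loop can record each job's exit status. This is only a minimal sketch, assuming bash; `sync_host` and the host names are hypothetical stand-ins for the real rsync command and host list, so the example runs without any remote machines:

```shell
#!/bin/bash
# Hypothetical stand-in for the real rsync call; it "fails" for one host only.
sync_host() {
  # Replace this body with the rsync command from the loop above.
  [ "$1" != "badhost" ]
}

listofhosts=(host1 badhost host3)   # example values, not real hosts

procs=()
for i in "${listofhosts[@]}"; do
  sync_host "$i" &
  procs+=("$!")
done

failed=()
for idx in "${!procs[@]}"; do
  # wait returns the exit status of the job with the given PID
  wait "${procs[$idx]}" || failed+=("${listofhosts[$idx]}")
done

if [ "${#failed[@]}" -gt 0 ]; then
  echo "failed hosts: ${failed[*]}" >&2
fi
```

Because the PIDs are stored in the same order as the hosts, the index of a failed PID maps straight back to its host name.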

With xargs, you can limit how many instances you run:

# probably better if you have the hosts in a file instead of an array actually,
# and simply run xargs <filename -P 17 -n 1 ...
printf '%s\n' "${listofhosts[@]}" |
xargs -P 17 -n 1 sh -c 'rsync -rv --checksum folder/ -e "ssh -i rsa_key -o StrictHostKeyChecking=no" user@"$0":/var/test/folder --delete || exit 1'

Perhaps notice how we sneakily smuggle in the host in $0. You could equivalently but slightly less obscurely populate $0 with a dummy string and use $1, but it doesn't really make a lot of difference here.
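For comparison, here is a sketch of the $1 variant. The `_` after the script is the dummy string that ends up in $0, and `echo` stands in for rsync so the example can run without any hosts (the host names are made up):

```shell
#!/bin/bash
# "_" fills $0, so the host that xargs appends arrives as $1
out=$(printf '%s\n' host1 host2 |
  xargs -n 1 sh -c 'echo "target: user@$1:/var/test/folder"' _)
echo "$out"
```

With the real command, the `echo` would be replaced by the rsync invocation, with `user@"$1":/var/test/folder` as the destination.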

The -P 17 says to run a maximum of 17 processes in parallel (obviously, tweak to your liking), and -n 1 says to pass only one argument (here, one host) to each invocation of the command. xargs still does not offer a way to kill the jobs that are already running if one of the processes fails; the most it will do is stop starting new ones when a command exits with status 255. It also only reports a summary result code (the exit code from xargs will be non-zero if at least one of the processes failed).
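In a script, that summary exit code is what you would test. A small sketch, in which the `sh -c` snippet is a stand-in for rsync that fails only for a made-up host called `bad`:

```shell
#!/bin/bash
status=0
if printf '%s\n' ok1 bad ok2 |
   xargs -P 2 -n 1 sh -c '[ "$0" != bad ]'
then
  echo "all transfers succeeded"
else
  # $? here is the exit status of the pipeline, i.e. of xargs
  status=$?
  echo "at least one transfer failed (xargs exit status $status)" >&2
fi
```

The status tells you that something failed, not which host; for that, each job would have to log its own result.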

CodePudding user response:

With GNU Parallel it should be something like this:

printf '%s\n' "${listofhosts[@]}" | parallel --will-cite --halt now,fail=1 rsync -rv --delete --checksum  -e $(printf '%q' 'ssh -i rsa_key -o StrictHostKeyChecking=no') folder/ user@{}:/var/test/folder

The tricky part is that you have to explicitly escape the arguments containing spaces (and other characters that are in $IFS).
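To see what the `printf '%q'` step does, independent of parallel, here is a sketch assuming bash. The `eval` stands in for the second shell that parses the command line again:

```shell
#!/bin/bash
ssh_cmd='ssh -i rsa_key -o StrictHostKeyChecking=no'

# %q escapes the string so that a shell parsing it again sees a single word
escaped=$(printf '%q' "$ssh_cmd")
echo "escaped form: $escaped"

# Re-parse the escaped text the way a second shell would and count the words
eval "set -- $escaped"
echo "words after re-parsing: $#"
roundtrip=$1
```

After re-parsing, the escaped string should come back as one word identical to the original, which is what rsync needs to receive as the argument to -e.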

Note: You can limit the number of rsync processes that run in parallel with the -j option of parallel:

... | parallel -j 8 --will-cite --halt now,fail=1 rsync -rv ...