RSYNC with many files using `files-from`


I have many files named exported-0.txt, exported-1.txt, exported-2.txt, and so on (sequential suffixes), and each file contains one absolute file path per line. For example, exported-1.txt looks like this:

/directory1/img.png
/aabb/file.csv
/magic/file/boo/aaa/cc.jpg
...

I understand that rsync supports --files-from, which makes rsync read the list of paths in the given file and synchronize those files to another server. The problem is that I have thousands of exported-N.txt files, and each file has thousands of lines.
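
From what I've read, a single call would look something like this (where / is the source, since the listed paths are absolute, and user@remote:/your/remote/dir/ would be my destination):

rsync -a --files-from=exported-1.txt / user@remote:/your/remote/dir/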

So I am wondering about the best approach to run, for example, 5 rsync processes in parallel, each on a different list file containing thousands of paths that need to be synchronized.

I have almost no knowledge of the Linux command line, so I have no idea where to start. I am wondering if I can use xargs to generate the numbers 1-9999 so that it calls rsync on each numbered file, but I can't find a way to do that. Any suggestions?

CodePudding user response:

Here's a bash solution that processes your exported-*.txt files with n concurrent rsync processes.

The main idea is to concatenate a computed number of exported-*.txt files inside a process substitution <(...) and use the latter as the argument of --files-from:

#!/bin/bash
shopt -s nullglob

n=5                                # number of concurrent rsync processes

arr=( exported-*.txt )

# Split the list files into n chunks: the first "rem" chunks get
# "len" files each, the remaining ones get one file less.
(( len = ${#arr[@]} / n + 1 ))
(( rem = ${#arr[@]} % n ))

idx=0
while (( idx < ${#arr[@]} ))
do
    (( rem-- == 0 )) && (( len-- ))
    # "/" is the source because the lists hold absolute paths;
    # replace /your/remote/dir/ with your real target (e.g. host:/dir/).
    rsync -a --files-from=<(cat -- "${arr[@]:idx:len}") / /your/remote/dir/ &
    (( idx += len ))
done

wait
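
For example, with 12 list files and n=5, len starts at 3 and rem at 2, so the loop launches five background rsync processes over chunks of 3, 3, 2, 2 and 2 list files; wait then blocks until all of them have finished.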

CodePudding user response:

With bash:

for i in exported-*.txt; do
    rsync -a --files-from="$i" ...
done

Replace -a with whatever options you actually need, and ... with the source and target paths. Note that this runs the rsyncs one after another.
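
If you want the parallelism you asked about, you can let xargs fan the list files out to concurrent rsync processes instead. A minimal sketch, assuming the lists hold absolute paths (so / is the source) and user@remote:/dest/ stands in for your real target:

# Run up to 5 rsyncs at a time, one per exported-*.txt list file.
# "/" is the source because the listed paths are absolute;
# user@remote:/dest/ is a placeholder target.
printf '%s\0' exported-*.txt |
    xargs -0 -P5 -I{} rsync -a --files-from={} / user@remote:/dest/

Here -0 reads the NUL-delimited file names, -P5 caps the number of simultaneous processes at 5, and -I{} substitutes one list file per rsync invocation.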
