I have a list of 100 files in directory1 with .dll extension:
directory1/file1.dll
directory1/file2.dll
...
directory1/file100.dll
There is another directory, directory2, with 10000 files with the .dll extension, spread across many subdirectories:
directory2/subdirectory1/file1.dll
directory2/subdirectory2/file3.dll
...
directory2/subdirectory3/file10000.dll
I need to check whether files with the same names as the 100 files in directory1 exist anywhere in directory2, and then copy the ones that are found to directory3.
How can I do this in the most efficient way? Thank you in advance.
CodePudding user response:
Try this Shellcheck-clean code:
#! /bin/bash -p

dir1=directory1
dir2=directory2
dir3=directory3

shopt -s dotglob globstar nullglob

# Set up an associative array to record which files are in "$dir1"
declare -A is_in_dir1
for path in "$dir1"/*.dll; do
    file=${path##*/}
    is_in_dir1[$file]=1
done

for path in "$dir2"/**/*.dll; do
    file=${path##*/}
    if (( ${is_in_dir1[$file]-0} )); then
        cp -n -v -- "$path" "$dir3"
    fi
done
- You'll need to change the dir1, dir2, and dir3 settings.
- shopt -s sets some Bash configurations required by the code:
  - dotglob causes glob patterns (e.g. *.dll) to match names that begin with . (a dot). You might not want that.
  - globstar enables the use of ** to match paths recursively through directory trees.
  - nullglob makes globs expand to nothing when nothing matches (otherwise they expand to the glob pattern itself, which is almost never useful in programs).
- See BashGuide/Arrays - Greg's Wiki (last section) for information about associative arrays in Bash.
- See Removing part of a string (BashFAQ/100 (How do I do string manipulation in bash?)) for an explanation of ${path##*/}.
- See glob - Greg's Wiki for information about globbing in general, and globstar and the ** pattern in particular.
- The code should work with any file or directory names (including ones with spaces or newlines in them).
- I can't say that the code implements the "most efficient way", but it scans both "$dir1" and "$dir2" only once and I can't think of a way to make it significantly faster.
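To make the two idioms the answer relies on easier to see in isolation, here is a minimal sketch of the ${path##*/} expansion and of the associative-array lookup with a default value. The names base_name, is_known, and seen are hypothetical helpers introduced only for this illustration; they are not part of the answer's script.

```bash
#!/usr/bin/env bash

# Strip the longest prefix ending in "/" so only the file name remains.
base_name() {
    local path=$1
    printf '%s\n' "${path##*/}"
}

# Hypothetical lookup table keyed by file name (requires Bash 4 or later).
declare -A seen=( [file1.dll]=1 [file2.dll]=1 )

# Succeeds when the name is a key in "seen"; a missing key falls back to 0,
# which the arithmetic test treats as false.
is_known() {
    local name=$1
    (( ${seen[$name]-0} ))
}

base_name directory2/subdirectory1/file1.dll   # prints file1.dll
```

Because each lookup is a single hash probe, the cost of checking a file in the large tree should not depend on how many names were recorded from the small directory.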
CodePudding user response:
Assuming you have no spaces (or other unpleasant characters) in filenames:
cd /path/to/first_directory
for file in *.dll; do
    find /path/to/second_directory/ -name "$file" | xargs -I ABCD cp ABCD /path/to/third_dir
done
find goes through subdirectories. ABCD is a placeholder. You can see the details in man xargs:
    -I replace-str
        Replace occurrences of replace-str in the initial-arguments with names read from standard input. Also, unquoted blanks do not terminate input items; instead the separator is the newline character. Implies -x and -L 1.
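If filenames might contain quotes, backslashes, or newlines, one possible variant is to let find run cp itself through -exec, so that no name is passed through a pipe at all. This is only a sketch under assumptions: copy_matches and its parameter names are hypothetical, the destination directory is created if missing, and find is still run once per file in the first directory, so the large tree is traversed many times. Note also that -name treats its argument as a glob pattern, so a name containing characters such as [ or * would be matched as a pattern rather than literally.

```bash
#!/usr/bin/env bash

# copy_matches SRC TREE DEST (hypothetical helper)
# For every *.dll directly inside SRC, copy files with the same name found
# anywhere under TREE into DEST.
copy_matches() {
    local src=$1 tree=$2 dest=$3 path
    mkdir -p -- "$dest"
    for path in "$src"/*.dll; do
        # Skip the literal pattern when SRC contains no .dll files.
        [ -e "$path" ] || continue
        # find passes each match to cp as a single argument, whatever it contains.
        find "$tree" -type f -name "${path##*/}" -exec cp -- {} "$dest" \;
    done
}
```

It could be called as copy_matches /path/to/first_directory /path/to/second_directory /path/to/third_dir, reusing the placeholder paths from the loop above.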