I have a structure of directories and subdirectories whose endpoints contain files of some extension (say, jpg files). The structure of the directories is not fixed, so it can be something like:
top_directory
|__child1
|  |__one
|  |__two
|
|__child2
|  |__three
|
|__child3
   |__child3_1
      |__four
      |__five
      |__six
How can I make a script that counts the number of files of a given extension in the subdirectories where they exist?
In the past, when there was only one level of subdirectories, I did something like:
for entry in ./*/
do
    echo "$entry"
    ls -l "$entry"*.jpg | wc -l
done
This iterated, with entry, through all the subdirectories and counted the files. However, this obviously does not work when there are sub-subdirectories.
CodePudding user response:
Here's a not particularly clever way of doing it (it does effectively what yours does, but recursively, AND it doesn't address the fact that a .jpg name doesn't guarantee the file is actually a JPEG) -
( find . -type d -print | while IFS= read -r line; do echo "$line" $( ls -1 "$line"/*.jpg 2>/dev/null | wc -l); done ) | grep -v ' 0$'
Something quite similar to your request has been answered in detail at Unix & Linux Stack Exchange.
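If you'd rather stay close to your original loop, here is a minimal shell-only sketch, assuming bash 4+ for the globstar option, that recurses without invoking find at all:

#!/bin/bash
# Recurse with ** instead of find; requires bash >= 4.
shopt -s globstar nullglob       # ** matches recursively; unmatched globs expand to nothing

for dir in ./**/; do             # every directory, at any depth
    files=( "$dir"*.jpg )        # .jpg files directly inside $dir
    (( ${#files[@]} )) && printf '%s %d\n' "$dir" "${#files[@]}"
done

Like the one-liner above, directories with zero matches are simply skipped.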
CodePudding user response:
Using GNU find for -printf '.':
find /top/dir -type f -name '*.jpg' -printf . | wc -c
Unlike ls (which generally you should not use in scripts), it works even if a filename contains a newline.
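If your find lacks the GNU -printf extension, a portable (though slower, one process per match) variant along the same lines would be to print one empty line per match and count lines instead:

find /top/dir -type f -name '*.jpg' -exec echo \; | wc -l

Since echo here prints only a newline per file found, the filenames themselves (newlines included) never reach the count.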
Edit: Count files per sub-directory (asked in a comment):
There are a few ways to do it, but maybe like this. It's good for interactive output (i.e. to display to a user): you will see each subdirectory and its count. However, dirs containing zero .jpg files will not be listed (either a pro or a con, depending on use case).
find /top/dir -type f -name '*.jpg' -exec dirname -z -- {} + |
sort -z |
uniq -zc |
sort -znk 1,1 |
tr '\0' '\n'
This requires GNU tools for the null delimiters (-z flags). The second sort sorts counts, low to high. Add -r (reverse) for high to low.
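Where GNU tools aren't available, a rough sketch of the same idea without the -z flags, assuming no newlines in path names (and a dirname that accepts multiple operands; with a strictly POSIX dirname, use \; instead of +):

find /top/dir -type f -name '*.jpg' -exec dirname -- {} + |
sort |
uniq -c |
sort -nk 1,1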
CodePudding user response:
@dan has a good approach, but a similar approach making use of a helper script to count the files in each directory found is another simple and reasonably efficient way to do this. With your find command you will recursively find the subdirectories below a given directory. You retrieve the directory names with:
find /top/dir -type d -print -exec ./helperf '{}' jpg \;
The -print above is optional and simply outputs the current directory name before the helper script (helperf) outputs the number of files in that directory. jpg (or any file extension) is likewise optional; if omitted, all files in a given directory are counted. Since you invoke your helper script with -exec, you should make it executable (or include a full bash invocation for it).
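For example, a sketch of that last alternative, invoking bash explicitly so the script file itself doesn't need the executable bit:

find /top/dir -type d -print -exec bash ./helperf '{}' jpg \;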
The helper script, helperf, simply calls find similar to how @dan proposes, but limits the -maxdepth to 1 so only files in that directory are counted. Your helper script could be:
#!/bin/bash

[ -d "$1" ] && {                ## first param is directory
    if [ -n "$2" ]; then        ## ext given as second param
        find "$1" -maxdepth 1 -type f -name "*.$2" -printf . 2>/dev/null | wc -c
    else                        ## no ext given, count all files
        find "$1" -maxdepth 1 -type f -printf . 2>/dev/null | wc -c
    fi
}
Above:
- [ -d "$1" ] serves as a simple validation ensuring the argument passed is a valid directory. If not, the script silently exits.
- if [ -n "$2" ]; then checks whether a second (extension) argument was given; if so, the find on files is limited to files ending in that extension. Without it, all files in the directory are counted.
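Before running the examples below with -exec, mark the helper executable (or fall back to the explicit bash invocation shown earlier):

chmod +x ./helperf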
Example Use/Output
Given my tmp directory on this box has the structure:
tree -d
.
├── awk
├── clamav
│   └── src
└── st
Getting a count of all files results in:
$ find . -type d -print -exec ./helperf '{}' \;
.
40
./clamav
5
./clamav/src
0
./awk
2
./st
3
Which are the correct numbers of total files in each directory.
Now limiting to just .txt files (of which there are 6, in the parent directory only), you would have:
$ find . -type d -print -exec ./helperf '{}' txt \;
.
6
./clamav
0
./clamav/src
0
./awk
0
./st
0
This seems to be close to what you are looking for. Look it over and let me know if you have further questions.