Iterate over multiple files and remove those that contain a specific character fewer than x times

I'm new to shell scripting. I have more than 10,000 files and need to delete the ones that contain the "<" character fewer than 10 times.

wc -l * 2>&1 | while read -r num file; do ((num < 10)) && echo rm "$file"; done

This prints an rm command for every file with fewer than 10 lines, but how do I count the "<" character instead?

CodePudding user response：

With GNU grep, bash and GNU xargs:

#!/bin/bash

grep -cZ '<' * |
while IFS='' read -r -d '' file && read count
do
    (( count < 10 )) && printf '%s\0' "$file"
done |
xargs -0r rm

Explanations

grep -cZ outputs a stream of file \0 count \n records, where count is the number of lines that contain "<" (not the number of "<" characters).

The while loop reads the file name (using a NUL-byte delimiter) and the count (using a newline delimiter).
It then applies the filter and prints the names of the files to delete as NUL-delimited records.

Finally, xargs -0r rm deletes those files; -r keeps rm from running when the list is empty.
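Before deleting anything, it can help to look at the record stream itself. The following is a minimal sketch, assuming GNU grep and bash; preview_counts is a hypothetical helper name, not part of the answer. It only reformats the file \0 count \n records into readable lines and removes nothing.

```shell
#!/bin/bash

# Hypothetical helper: read the "file \0 count \n" records that
# `grep -cZ` writes and print them as "count<TAB>file" lines.
# It only prints; no file is deleted.
preview_counts() {
    local file count
    # First read: file name up to the NUL byte.
    # Second read: the count up to the newline.
    while IFS= read -r -d '' file && read -r count; do
        printf '%s\t%s\n' "$count" "$file"
    done
}

# Example (assumes at least two files match the glob, so that grep
# prints the file names):
#   grep -cZ '<' -- * | preview_counts
```

Piping the preview through sort -n puts the files with the lowest counts first, which makes it easy to check the threshold by eye.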

Here's an alternative with GNU awk and xargs:

awk -v n=10 '
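    # Count lines containing "<"; stop reading the file once n is reached.
    /</ && ++count >= n {
        nextfile
    }
    ENDFILE {
        if (count < n)
            printf "%s%c", FILENAME, 0
        count = 0    # reset here so that empty files start from zero too
    }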
' * |
xargs -0r rm

CodePudding user response：

Using GNU grep (the -m option stops reading each file after 10 matching lines, which makes this a bit more efficient):

#!/bin/bash

for f in *; do
    (( $(grep -Fc -m10 \< "$f") < 10 )) && echo rm "$f"
done

Drop the echo once the output looks right.
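Note that grep -c reports the number of matching lines, so a line such as <a><b> counts once. If the requirement is the number of "<" characters, one option is to count them with tr and wc. This is a minimal sketch, assuming bash and regular files in the current directory; count_lt and list_sparse_files are hypothetical helper names.

```shell
#!/bin/bash

# Hypothetical helper: print how many "<" characters a file contains.
# tr -cd deletes every byte except "<"; wc -c counts what is left.
count_lt() {
    tr -cd '<' < "$1" | wc -c
}

# Hypothetical helper: print the NUL-terminated names of regular files
# in the current directory that contain fewer than $1 (default 10)
# "<" characters. It only lists candidates; it does not delete them.
list_sparse_files() {
    local limit=${1:-10} f
    for f in *; do
        [[ -f $f ]] || continue    # skip directories and other non-files
        (( $(count_lt "$f") < limit )) && printf '%s\0' "$f"
    done
    return 0
}
```

To review the candidates, run list_sparse_files 10 | tr '\0' '\n'. Once the list looks right, list_sparse_files 10 | xargs -0r rm would delete them (GNU xargs assumed for -r).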