How do I use grep to count the number of occurrences of a string?

input:

.
├── a.txt
├── b.txt

// a.txt
aaa

// b.txt
aaa
bbb
ccc

Now I want to know how many times aaa and bbb appear.

output:

aaa: 2
bbb: 1

CodePudding user response:

Just an idea:

grep -E "aaa|bbb|ccc" *.txt | awk -F: '{print $2}' | sort | uniq -c

This means:

grep -E "...|..." : use extended regular expressions and print every line that matches any of the alternatives

The result is given as:
a.txt:aaa
b.txt:aaa
b.txt:bbb
b.txt:ccc

awk -F: '{print $2}' : split each line into columns, 
                       using the colon as the separator, 
                       and only show the second column

sort | uniq -c : sort the entries, then count each distinct one (uniq -c prints the count before the entry, e.g. "2 aaa")
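
The pipeline above counts matching lines and relies on grep printing a "file:" prefix. As a hedged variation, the sketch below assumes GNU or BSD grep (-o and -h are not in POSIX) and prints every match on its own line without the filename, so the tally counts occurrences. The scratch directory and the variable name counts are only for illustration; the file contents are the ones from the question.

```shell
# Recreate the question's sample files in a scratch directory (illustrative paths).
dir=$(mktemp -d)
printf 'aaa\n' > "$dir/a.txt"
printf 'aaa\nbbb\nccc\n' > "$dir/b.txt"

# -o prints each match on its own line, -h suppresses the "file:" prefix,
# so uniq -c tallies occurrences rather than matching lines.
# The final awk swaps "count entry" into the "entry: count" layout from the question.
counts=$(grep -ohE 'aaa|bbb' "$dir"/*.txt | sort | uniq -c | awk '{print $2 ": " $1}')
printf '%s\n' "$counts"

rm -rf "$dir"
```

With the question's two files this prints aaa: 2 and bbb: 1.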

CodePudding user response:

You can try awk. This uses split to count the occurrences of the search patterns and puts them in the "associative" array n.

$ awk 'BEGIN{ pat1="aaa"; pat2="bbb" }
    { n[pat1] += (split($0,arr,pat1)-1) }   # pieces minus one = matches on this line
    { n[pat2] += (split($0,arr,pat2)-1) }   # += accumulates over all lines and files
    END{ for(i in n){ print i":",n[i] } }' a.txt b.txt
aaa: 10
bbb: 14

Example data

$ cat a.txt
aaa
aaa efwepom dq
bbb qwpdo bbb
qwdo qwdpomaaa
qwo bbb
pefaaaomaaaewe bb aa
aaa bbb

$ cat b.txt
aaa
aaa efwepom dq
bbb qwpdo bbb
qwdo qwdpomaaa
qwo bbb
pebbb bbb fobbbmebbbwe bb aa
aaa bbb
bbbbbbsad
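
A possible variation on the split approach, offered as a sketch and not as the answer's own method: gsub returns the number of replacements it made, so it can be used as a match counter, and it returns 0 on an empty line (where split would return 0 and the "minus one" would subtract from the total). The variable names pats, p and np and the sample file are made up for this illustration.

```shell
# Small hypothetical sample, including an empty line.
dir=$(mktemp -d)
printf 'aaa bbb aaa\n\nbbbbbb\n' > "$dir/sample.txt"

result=$(awk -v pats='aaa bbb' '
    BEGIN { np = split(pats, p, " ") }          # p[1]="aaa", p[2]="bbb"
    {
        for (i = 1; i <= np; i++) {
            line = $0                           # work on a copy so $0 stays intact
            n[p[i]] += gsub(p[i], "", line)     # gsub returns the number of matches replaced
        }
    }
    END { for (i = 1; i <= np; i++) print p[i] ": " (n[p[i]] + 0) }
' "$dir/sample.txt")
printf '%s\n' "$result"

rm -rf "$dir"
```

Iterating over the numeric index in END keeps the output in the order the patterns were given, which for (i in n) does not guarantee. As with split, matches are counted without overlap, so bbbbbb counts as two bbb.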

CodePudding user response:

Assuming you do not want the counts separated by filename, and since you specified grep, I'd loop over the patterns. The catch is that grep -c counts matching lines, not occurrences, so a line that contains the pattern more than once is still counted once.

for pattern in aaa bbb; do 
  printf "%s: " "$pattern"
  cat a.txt b.txt | grep -Ec "$pattern"
done
aaa: 2
bbb: 1

awk is cleaner though.
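
If several matches per line need to be counted, one way to keep the same loop is to swap grep -c for grep -o piped into wc -l. This is a sketch assuming a grep that supports -o (GNU and BSD grep do); the sample contents below are hypothetical and differ from the question's files so that one line holds the pattern twice.

```shell
# Hypothetical sample: b.txt has "aaa" twice on one line.
dir=$(mktemp -d)
printf 'aaa\n' > "$dir/a.txt"
printf 'aaa bbb aaa\nbbb\n' > "$dir/b.txt"

out=$(
  for pattern in aaa bbb; do
    # grep -o emits one line per match, so wc -l counts matches, not lines.
    # tr strips the padding some wc implementations add.
    count=$(cat "$dir/a.txt" "$dir/b.txt" | grep -oE "$pattern" | wc -l | tr -d ' ')
    printf '%s: %s\n' "$pattern" "$count"
  done
)
printf '%s\n' "$out"

rm -rf "$dir"
```

On this sample the loop prints aaa: 3 and bbb: 2, whereas grep -c would report aaa: 2 because the two matches on one line count as a single matching line.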