Is there a way in Linux to filter multiple files with a bunch of data in one command, without writing a script?
For this example I want to know how many males appear by date. The problem is that a specific date (January 3rd) appears in 2 separate files:
file1
Jan 1 john male=yes
Jan 1 james male=yes
Jan 2 kate male=no
Jan 3 jonathan male=yes
file2
Jan 3 alice male=no
Jan 4 john male=yes
Jan 4 jonathan male=yes
Jan 4 alice male=no
I want the total number of males for each date, across all files. If there are no males for a specific date, that date should produce no output.
Jan 1 2
Jan 3 1
Jan 4 2
The only way I can think of is to count the number of males for one specific date at a time, but this would not be performant: in real-world cases there could be many more files, and manually entering all the dates would be a waste of time. Any help would be appreciated, thank you!
localhost:~# cat file1 file2 | grep "male=yes" | grep "Jan 1" | wc -l
2
Answer:
grep -h 'male=yes' file? | \
cut -d' ' -f1,2 | \
awk '{c[$0]++} END {for (i in c) print i, c[i]}'
The grep prints only the male=yes lines from all matching files (-h suppresses the file name prefixes), cut keeps just the first two space-separated fields (the date), and awk counts how many times each date occurs and prints every date with its counter at the end.
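For what it's worth, the same counting can be done in awk alone, without grep and cut. This is just a sketch that assumes the date is always the first two fields and the male=yes flag is always the last field:

awk '$NF == "male=yes" {c[$1 " " $2]++} END {for (d in c) print d, c[d]}' file1 file2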
Given your files the output will be:
Jan 1 2
Jan 3 1
Jan 4 2
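One caveat: awk's for (i in c) loop visits the dates in an unspecified order, so the lines may not come out chronologically. If that matters and you have GNU sort, you can append a sort by month name and day number:

grep -h 'male=yes' file? | cut -d' ' -f1,2 | awk '{c[$0]++} END {for (i in c) print i, c[i]}' | sort -k1,1M -k2,2n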