UNIX: How to count number of rows in multiple files without headers

Time:11-19

I have a set of files with a similar naming pattern. I am trying to get the total row count of all the files combined, excluding the header of each file, in one go. But I am having trouble with the commands.

I have tried:

sed '1d' IN-Pass-30* | wc -l

and

awk 'END {print NR-1}' IN-Pass-30*

But each time it only subtracts the header count from just one file. What am I doing wrong here?

CodePudding user response:

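You were close. sed treats all the files on its command line as one continuous stream, so '1d' deletes only the first line of the first file. Run sed once per file by wrapping it in a shell loop over the glob: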

for f in IN-Pass-30*; do sed '1d' "$f"; done | wc -l
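The awk attempt from the question can be repaired in a similar spirit. NR keeps counting across all input files, while FNR restarts at 1 for each file, so counting only the lines with FNR > 1 skips every header. Below is a minimal sketch; the scratch directory, the file names, and the file contents are made up for the demonstration.

```shell
# Demo in a scratch directory; file names and contents are hypothetical.
tmpdir=$(mktemp -d)
printf 'id,name\n1,a\n2,b\n' > "$tmpdir/IN-Pass-30-a.csv"   # header + 2 rows
printf 'id,name\n3,c\n'      > "$tmpdir/IN-Pass-30-b.csv"   # header + 1 row

# FNR is the line number within the current file, NR the running total.
# Count every line that is not the first line of its own file.
# "n + 0" makes awk print 0 instead of an empty string when nothing matched.
rows=$(awk 'FNR > 1 { n++ } END { print n + 0 }' "$tmpdir"/IN-Pass-30*)
echo "$rows"

rm -rf "$tmpdir"
```

If you happen to have GNU sed, its -s (--separate) option treats each file as its own stream, so sed -s '1d' IN-Pass-30* | wc -l should also work without a loop. That option is a GNU extension, so do not rely on it on other systems.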

CodePudding user response:

I propose the following "simple" solution:

Prompt> find ./ -maxdepth 1 -name "IN-Pass-30*" | wc -l
53
Prompt> cat IN-Pass-30* | wc -l
1418549
Prompt> echo $(($(cat IN-Pass-30* | wc -l) - $(find ./ -maxdepth 1 -name "IN-Pass-30*" | wc -l)))
1418496

What does this mean?

Prompt> find ./ -maxdepth 1 -name "IN-Pass-30*" | wc -l
// find all matching files in the current directory, without descending into subdirectories.
// once they are found, count them.

Prompt> cat IN-Pass-30* | wc -l
// use `cat` to concatenate the contents of all the files.
// then count the number of lines.

Prompt> echo $((a - b))
// calculate the difference between a and b.

Prompt> echo $(command)
// substitute the output of a command (here it is just printed, but you can use it any other way)

The whole idea is that each file's header takes exactly one line. So the total number of lines in all the files, minus the number of files (which equals the number of header lines), gives the desired result.
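For a quick check of that arithmetic, here is a small sketch on made-up data. It counts the files with the shell's own glob (via the positional parameters) instead of find, which is my substitution and not part of the commands above. It assumes every matching file is non-empty and has exactly one header line; an empty file would still be subtracted and make the result too low.

```shell
# Demo in a scratch directory; file names and contents are hypothetical.
tmpdir=$(mktemp -d)
printf 'header\n1\n2\n' > "$tmpdir/IN-Pass-30-a"   # header + 2 rows
printf 'header\n3\n'    > "$tmpdir/IN-Pass-30-b"   # header + 1 row

# Load the matching files into the positional parameters.
set -- "$tmpdir"/IN-Pass-30*
nfiles=$#                       # number of files = number of header lines
total=$(cat "$@" | wc -l)       # all lines, headers included
rows=$((total - nfiles))        # data rows only
echo "$rows"

rm -rf "$tmpdir"
```

With the two sample files this is 5 lines in total minus 2 headers, so 3 data rows.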
