Is there a way to count the number of non-empty values in the 1st column of a file using awk?
My file:
abc|87123
cdb|
fgytw|23321
ghft|
|87635
Expected output: 4
I tried the command below, but it's not working:
awk -F'|' 'NF==$1{c++}END {print c}' file
CodePudding user response:
You can use
awk -F\| 'length($1){c++} END{print c}' file
See the online demo:
#!/bin/bash
s='abc|87123
cdb|
fgytw|23321
ghft|
|87635'
awk -F\| 'length($1){c++} END{print c}' <<< "$s"
# => 4
That is, c is only incremented if the length of field 1 is greater than zero.
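To make the length test visible, here is a minimal sketch that prints the length of the first field next to each line before counting. The temporary file and the variable names sample and count are assumptions for illustration, not part of the answer above.

```shell
# Recreate the question's sample data in a temporary file (hypothetical name).
sample=$(mktemp)
printf '%s\n' 'abc|87123' 'cdb|' 'fgytw|23321' 'ghft|' '|87635' > "$sample"

# Print each line's first-field length; a length of 0 marks an empty first column.
awk -F'|' '{ printf "%d\t%s\n", length($1), $0 }' "$sample"

# Count only the lines whose first field has a non-zero length.
count=$(awk -F'|' 'length($1){c++} END{print c+0}' "$sample")
echo "$count"
rm -f "$sample"
```

Only the last sample line has a first-field length of 0, so the count for this data is 4.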
CodePudding user response:
$ awk -F'|' '$1 != ""{c++} END{print c+0}' file
4
You need the +0 at the end to get a numeric 0 as output instead of a blank line when no lines match the condition.
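The effect of adding zero can be checked with input where nothing matches. This is a small sketch on made-up data; the variable names empty_input, without_plus and with_plus are hypothetical.

```shell
# Input in which every line has an empty first column, so nothing matches.
empty_input='|1
|2'

# Without +0, c is never set and prints as an empty string.
without_plus=$(printf '%s\n' "$empty_input" | awk -F'|' '$1 != ""{c++} END{print c}')

# With +0, the unset variable is coerced to the number 0.
with_plus=$(printf '%s\n' "$empty_input" | awk -F'|' '$1 != ""{c++} END{print c+0}')

printf 'without +0: [%s]\nwith +0: [%s]\n' "$without_plus" "$with_plus"
```

The brackets in the output make the empty result visible; this matters mostly when the count is consumed by another script that expects a number.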
CodePudding user response:
1st solution: With your shown samples, please try the following awk code. In short: for each line, it checks that the 1st field contains no whitespace and has a non-zero length, and if so increments the counter. Once the whole Input_file has been read, the END block of the awk code prints the total number of matches found.
awk -F'|' '$1!~/[[:space:]]/ && length($1){count++} END{print count}' Input_file
NOTE: You can also change [[:space:]] to [[:blank:]] in case you may have spaces or tabs in the first column.
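How the whitespace check differs from a plain length test can be seen on a few made-up lines. This is only a sketch: the input is hypothetical, and it assumes an awk that supports POSIX character classes such as [[:space:]].

```shell
# Hypothetical input: a normal field, a field holding one space, and a field with an inner space.
input='abc|1
 |2
a b|3'

# Plain length test: any non-empty first field counts, including the lone space.
by_length=$(printf '%s\n' "$input" | awk -F'|' 'length($1){c++} END{print c+0}')

# Whitespace-aware test: first fields containing any whitespace character are skipped.
no_space=$(printf '%s\n' "$input" | awk -F'|' '$1!~/[[:space:]]/ && length($1){c++} END{print c+0}')

echo "length only: $by_length, no whitespace: $no_space"
```

Note that the whitespace-aware condition also skips values like "a b", so it only fits data where the first column never legitimately contains whitespace.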
2nd solution: Using a GNU grep + wc combination.
grep -oP '^\S+\|' Input_file | wc -l
3rd solution: As suggested in the comments by RARE kpoop Manifesto, one could also try the following.
awk -F'^[[:space:]]*[|]' '{ count += NF == 1 } END { print count}' Input_file
CodePudding user response:
What about this:
echo $(( $(cat test.txt | wc -l) - $(grep "^|" test.txt | wc -l) ))
To give you an idea what it means:
cat test.txt | wc -l
This counts the number of lines in the entire file. Don't use wc -l test.txt because this also outputs the name of the file, which you don't need.
grep "^|" test.txt | wc -l
That's a neat trick: ^ means "start of line". When it is followed by the column separator, the first column is not filled in. So, grep "^|" test.txt | wc -l gives the number of lines where the first column is not filled in.
Now, how to combine both? Simply use arithmetic expansion, e.g. $((5-1)) for the sample file (5 lines in total, 1 of them with an empty first column), which performs an integer calculation.
I admit, it looks nasty, but it does the job! :-)
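The same subtraction can be spelled out step by step. This is a sketch, not the answer's exact command: it swaps grep | wc -l for grep -c and cat | wc -l for an input redirection, and the names file, total, empty and result are hypothetical.

```shell
# Recreate the sample data in a temporary file (hypothetical; the answer uses test.txt).
file=$(mktemp)
printf '%s\n' 'abc|87123' 'cdb|' 'fgytw|23321' 'ghft|' '|87635' > "$file"

# Total number of lines; reading from stdin keeps the file name out of wc's output.
total=$(wc -l < "$file")

# Number of lines that start with the separator, i.e. have an empty first column.
empty=$(grep -c '^|' "$file")

# Arithmetic expansion subtracts the two integers.
result=$(( total - empty ))
echo "$result"
rm -f "$file"
```

Keep in mind that this approach treats a completely blank line as having a filled-in first column, because such a line does not start with the separator.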
CodePudding user response:
Another awk solution:
awk '/^[^|]/{++c} END {print c}' file
4
CodePudding user response:
$ wc -l < <(sed '/^|/d' file)
4
$ sed '/^|/d' file|sed -n '$='
4
$ grep -c "^[^|]" file
4
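These commands agree on the sample file, but they need not agree once the input contains blank lines. A minimal sketch on made-up input (the variable names input, via_sed and via_grep are hypothetical):

```shell
# Hypothetical input with one blank line between a filled and an empty first column.
input='abc|1

|2'

# Deleting lines that start with | keeps the blank line, so it is counted.
via_sed=$(printf '%s\n' "$input" | sed '/^|/d' | wc -l)

# Requiring a first character other than | skips the blank line.
via_grep=$(printf '%s\n' "$input" | grep -c '^[^|]')

echo "sed + wc: $((via_sed)), grep -c: $via_grep"
```

If the file can contain blank lines, the grep -c variant is probably the safer choice of the three.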
CodePudding user response:
Keep it simple - 3 ways of saying the same thing:
{m,g}awk '{ _ += NF } END { print NR-_+NR }' FS='^[|]'
{m,g}awk '{ _ +=!__~NF } END { print _ }' FS='^[|]'
{m,g}awk '{ _ +=/^\|/ } END { print NR-_ }' FS='^$'
4
If you don't mind loading the whole file at once, it gets even easier:
- single subtraction of the gsub() result
- no tracking needed
- input rows become "fields" in this context
{m,g}awk '$!NF = NF - gsub("(^|\n)[|]|\n$","&")' FS='\n' RS='^$'
4
Or, if you want to do it in reverse order (admittedly, overkill for the task):
{m,g}awk '$!NF= gsub("[^|]+","&", $!(NF = NF))' RS='^$' \
OFS='|' FS='[|]([^|\n]*[|])*[^|\n]*\n' OFS='|'
4