I have a file that looks like this:
FID IID data1 data2 data3
1 RQ00001-2 1.670339 -0.792363849 -0.634434791
2 RQ00002-0 -0.238737767 -1.036163943 -0.423512414
3 RQ00004-9 -0.363886913 -0.98661685 -0.259951265
3 RQ00004-9 -9 -0.98661685 0.259951265
I want to count the number of positive versus negative numbers in column 3 (data1), excluding -9. For column 3 that is 1 positive versus 2 negative; I did not count -9 because it stands for missing data. For data2 it is 0 positive versus 4 negative, and for the last column it is 1 positive versus 3 negative.
I would prefer to use awk, but I am new to it and need help. The command below counts every value that starts with a minus sign, but I need it to exclude -9. Is there a more sophisticated way of doing this?
awk '$3 ~ /^-/{cnt++} END{print cnt}' filename.txt
CodePudding user response:
You can use this awk solution:
awk -v c=3 '
NR > 1 && $c != -9 {   # skip the header and the missing-data marker
    if ($c < 0)
        neg++
    else
        pos++           # zero is counted as positive here
}
END {
    printf "Positive: %d, Negative: %d\n", pos, neg
}' file
Positive: 1, Negative: 2
Running it with c=5:
awk -v c=5 'NR > 1 && $c != -9 {if ($c < 0) neg++; else pos++} END {printf "Positive: %d, Negative: %d\n", pos, neg}' file
Positive: 1, Negative: 3
CodePudding user response:
Assumptions:
- determine the number of negative and positive values for the 3rd thru Nth columns
One awk idea:
awk '
NR>1 { for (i=3;i<=NF;i++) {
           if      ($i == -9) continue    # -9 means missing data
           else if ($i <  0)  neg[i]++
           else if ($i >  0)  pos[i]++
       }
     }
END  { printf "Neg/Pos"
       for (i=3;i<=NF;i++)
           printf "%s%s/%s",OFS,neg[i]+0,pos[i]+0
       print ""
     }
' filename.txt
This generates:
Neg/Pos 2/1 4/0 3/1
NOTE: OP hasn't provided an example of the expected output; all of the counts are located in the arrays so modifying the output format should be relatively easy once OP has provided a sample output
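As an illustration of how the output format could be changed, here is a minimal sketch that labels each count with the column name taken from the header line. The variable names (`data`, `report`, `name`, `ncols`) and the output wording are assumptions made for this example, not something the OP asked for; the sample rows are the ones from the question.

```shell
# Sample rows copied from the question, held in a shell variable.
data='FID IID data1 data2 data3
1 RQ00001-2 1.670339 -0.792363849 -0.634434791
2 RQ00002-0 -0.238737767 -1.036163943 -0.423512414
3 RQ00004-9 -0.363886913 -0.98661685 -0.259951265
3 RQ00004-9 -9 -0.98661685 0.259951265'

report=$(printf '%s\n' "$data" | awk '
NR == 1 {                              # remember the column names
    ncols = NF
    for (i = 3; i <= NF; i++) name[i] = $i
    next
}
{
    for (i = 3; i <= ncols; i++) {
        if ($i == -9) continue         # -9 marks missing data
        if      ($i < 0) neg[i]++
        else if ($i > 0) pos[i]++
    }
}
END {                                  # numeric loop keeps column order
    for (i = 3; i <= ncols; i++)
        printf "%s: %d negative, %d positive\n", name[i], neg[i], pos[i]
}')
printf '%s\n' "$report"
```

Looping over the column index in the END block, rather than using `for (i in neg)`, keeps the columns in file order, since awk does not guarantee the traversal order of `for ... in`.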
CodePudding user response:
$ awk '
NR == 1 {
    for(i = 3; i <= NF; i++) header[i] = $i
}
NR > 1 {
    for(i = 3; i <= NF; i++) {
        # each comparison yields 1 or 0, so adding it increments the counter
        pos[i] += ($i >= 0); neg[i] += (($i != -9) && ($i < 0))
    }
}
END {
    for(i in pos) {
        if (header[i] == "") header[i] = "column " i
        printf("%-10s: %d positive, %d negative\n", header[i], pos[i], neg[i])
    }
}' file
data1     : 1 positive, 2 negative
data2     : 0 positive, 4 negative
data3     : 1 positive, 3 negative
CodePudding user response:
awk '
NR > 1 && $3 != -9 {$3 >= 0 ? p++ : n++}
END {print "pos: "p+0, "neg: "n+0}' file
Gives:
pos: 1 neg: 2
You can change n++ to --p to get a single number p, equal to the number of positives minus the number of negatives.
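A minimal sketch of that single-number variant, run against the sample rows from the question. It is written with `+=` and a parenthesized ternary instead of the bare ternary statement, purely as a readability choice for this example; the shell variable names `data` and `net` are assumptions.

```shell
# Sample rows copied from the question.
data='FID IID data1 data2 data3
1 RQ00001-2 1.670339 -0.792363849 -0.634434791
2 RQ00002-0 -0.238737767 -1.036163943 -0.423512414
3 RQ00004-9 -0.363886913 -0.98661685 -0.259951265
3 RQ00004-9 -9 -0.98661685 0.259951265'

# +1 for every non-negative value, -1 for every negative one; -9 is skipped.
net=$(printf '%s\n' "$data" | awk '
NR > 1 && $3 != -9 { p += ($3 >= 0 ? 1 : -1) }
END { print p + 0 }')
echo "$net"
```

For column 3 of the sample this prints -1, because there is one positive value and two negative ones once -9 is excluded.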
CodePudding user response:
Below are some examples of how you can achieve this.
Note: we assume that -0.0 and 0.0 are positive.
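To see why that assumption matters, here is a small sketch using made-up rows (not the OP's data) that contain 0.0 and -0.0. In awk neither value compares as less than zero, so a test like `$n<0` never counts them as negative. The three-way split into `pos`, `neg` and `zero` is only an illustration of how zeros could be reported separately if that is preferred.

```shell
# Hypothetical input: a header plus the values 0.0, -0.0, -9 and -2 in column 3.
zeros=$(printf 'FID IID data1\n1 a 0.0\n2 b -0.0\n3 c -9\n4 d -2\n' | awk '
NR > 1 && $3 != -9 {
    if      ($3 < 0) neg++
    else if ($3 > 0) pos++
    else             zero++     # both 0.0 and -0.0 land here
}
END { printf "pos=%d neg=%d zero=%d\n", pos, neg, zero }')
echo "$zeros"
```

This prints `pos=0 neg=1 zero=2`: only -2 is counted as negative, -9 is skipped, and both zeros fall into the third bucket.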
Count negative numbers in column n (passed with -v, here n=3):
$ awk -v n=3 '(FNR>1){c+=($n<0)}END{print "pos:",(NR-1-c),"neg:"c+0}' file
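Count negative numbers in column n, but ignore -9 (c counts every negative value including -9, d counts the -9 entries, so the real negatives are c-d):
$ awk -v n=3 '(FNR>1){c+=($n<0);d+=($n==-9)}END{print "pos:",(NR-1-c),"neg:"c-d}' file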
Count negative numbers in columns m to n (here m=3, n=5):
$ awk -v m=3 -v n=5 '(FNR>1){for(i=m;i<=n;++i) c[i]+=($i<0)}
END{for(i=m;i<=n;++i) print i,"pos:",(NR-1-c[i]),"neg:"c[i]+0}' file
Count negative numbers in columns m to n, but ignore -9:
$ awk -v m=3 -v n=5 '(FNR>1){for(i=m;i<=n;++i) {c[i]+=($i<0);d[i]+=($i==-9)}}
END{for(i=m;i<=n;++i) print i,"pos:",(NR-1-c[i]),"neg:"c[i]-d[i]}' file