I have a file (my_file) and want to compute the fraction of values in column 11 that are below 0.05.
I tried:
echo $($(cat my_file | cut -f 11 | awk '$1 < 5E-2' | wc -l) / $(cat my_file | cut -f 11 | wc -l))
but I get the error: 1158532: command not found
Could anyone please help me see where I am wrong?
CodePudding user response:
Consider the string:
$(cat my_file | cut -f 11 | awk '$1 < 5E-2' | wc -l)
The $() construct is a "command substitution". The commands inside $() are executed, and their output replaces the $() on the command line. Here that output is the first word of the enclosing command, so bash attempts to execute it: if the pipeline produces the output "1158532", bash looks for a command named 1158532. There is no such command in your PATH, so you get the error message that you see. You really should just do this whole thing in awk with something like:
awk '$11 < 0.05 {c++} END {printf "%0.4f\n", c / NR}' my_file
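The "command not found" behavior described above can be reproduced in isolation. This is only a minimal sketch: demo_broken and demo_ok are hypothetical names, and the values 42 and 100 stand in for the two wc -l results.

```shell
#!/bin/sh
# The inner $(echo 42) expands to the text "42". Because it is the first
# word on the command line, the shell tries to run a program named 42.
demo_broken() {
    $(echo 42) / $(echo 100)
}

# Used as arguments to another command, the same expansions are harmless.
demo_ok() {
    echo "$(echo 42) / $(echo 100)"
}
```

Calling demo_broken should fail with a "command not found" style message and exit status 127, while demo_ok just prints the expression as text.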
To help understand why your command does not work, it might help to consider "fixing" it to be:
expr "$( cat my_file | cut -f 11 | awk '$1 < 5E-2' | wc -l)" / "$(cat my_file | cut -f 11 | wc -l)"
but notice that this will produce 0 or 1, since expr does integer arithmetic, not floating point. You could get floating point values by running the data through bc with:
echo "$( cat my_file | cut -f 11 | awk '$1 < 5E-2' | wc -l)" / "$(cat my_file | cut -f 11 | wc -l)" | bc -l
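The integer truncation is easy to see with small numbers. The sketch below uses the shell's built-in $(( )) arithmetic, which also works on integers, in place of expr, and awk in place of bc for the floating point version; the variable names hits and total and their values are made up for illustration.

```shell
#!/bin/sh
hits=2      # hypothetical count of values below the threshold
total=3     # hypothetical total number of lines

# Integer division: the fractional part is discarded.
int_ratio=$(( hits / total ))

# awk does its arithmetic in floating point.
float_ratio=$(awk -v h="$hits" -v t="$total" 'BEGIN { printf "%.4f", h / t }')

echo "integer: $int_ratio  float: $float_ratio"
```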
Note that every useless use of cat (UUOC) here should be removed (e.g., with < my_file cut -f 11), and cut | awk is generally an anti-pattern. Just do the whole thing in awk.
CodePudding user response:
I think you might be able to handle this all via awk:
awk 'BEGIN {cnt=0} { if ($11<.05) cnt+=1 } END {printf "%2.2f%%\n", cnt/NR*100}' my_file
CodePudding user response:
Using only awk:
awk '$11 < 0.05 {c++} END {print c}' my_file
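One caveat with printing the counter directly: if no row matches, the counter is never assigned and awk prints an empty line. Adding 0 forces a numeric result. A possible sketch, where count_below is a hypothetical helper name and the threshold is passed in with -v:

```shell
#!/bin/sh
# count_below FILE THRESHOLD  (hypothetical helper)
# Prints how many rows have an 11th whitespace-separated field
# below THRESHOLD; prints 0 rather than an empty line when none match.
count_below() {
    awk -v limit="$2" '$11 < limit {c++} END {print c+0}' "$1"
}
```

For example, count_below my_file 0.05 would print the same count as the one-liner above whenever at least one row matches.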
CodePudding user response:
Here is an example of how to transform parts of your command into shorter equivalents:
cat my_file | cut -f 11 | wc -l
cat my_file | wc -l
wc -l < my_file
cat my_file | cut -f 11 | awk '$1 < 5E-2' | wc -l
cat my_file | awk -F'\t' '$11 < 5E-2' | wc -l
awk -F'\t' '$11 < 5E-2' my_file | wc -l
awk -F'\t' '$11 < 5E-2 {c++} END {print c}' my_file
To divide the two results:
awk -F'\t' '$11 < 5E-2 {c++} END {print c/NR}' my_file
0.666667
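To try the final command without the real data, something like the following could be used. The sample file is invented for illustration: three tab-separated rows with ten filler columns, and an 11th column holding 0.01, 0.20 and 0.03, so two of the three rows fall below 0.05. It assumes mktemp is available.

```shell
#!/bin/sh
# Build a tiny tab-separated sample file (hypothetical data).
# printf reuses the format once per argument, giving three rows.
sample=$(mktemp)
printf 'x\tx\tx\tx\tx\tx\tx\tx\tx\tx\t%s\n' 0.01 0.20 0.03 > "$sample"

# Same command as above, run against the sample file.
ratio=$(awk -F'\t' '$11 < 5E-2 {c++} END {print c/NR}' "$sample")
echo "$ratio"

rm -f "$sample"
```

Note that an empty input file would make NR equal to 0, so the division in the END block would fail; a real script may want to check for that case first.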