I want to replace multiple column values if a certain queried column is below a certain value.
Example file test:
cat test
OTU Phy P.conf Class C.conf Ord ord.conf Spec S.conf
1 Mollusca 90 Bivalvia 80 Venerida 80 Rangia 80
2 Chordata 88 Fish 20 Salmon 0 pink 0
3 Cnidaria 100 Coral 78 fire 22 octo 12
Basically, I'd like to make taxonomic names "NA" if the confidence values are below a certain value.
I have tried this:
cat test | awk ' $3<90 {$2="NA"}1'
OTU Phy P.conf Class C.conf Ord ord.conf Spec S.conf
1 Mollusca 90 Bivalvia 80 Venerida 80 Rangia 80
2 NA 88 Fish 20 Salmon 0 pink 0
3 Cnidaria 100 Coral 78 fire 22 octo 12
This changes the phylum name (Phy column) in the 2nd row to NA because the phylum confidence (column 3) is below 90.
What I'd like to do is this: if the value in column 3 falls below 90, also change the name columns to the right (the lower taxonomic levels) to NA, e.g.
OTU Phy P.conf Class C.conf Ord ord.conf Spec S.conf
1 Mollusca 90 Bivalvia 80 Venerida 80 Rangia 80
2 NA 88 NA 20 NA 0 NA 0
3 Cnidaria 100 Coral 78 fire 22 octo 12
I thought this would be pretty easy, but how do I change multiple columns once the first condition is met?
Thanks for any help.
LP
CodePudding user response:
You can use this awk, which selects rows where $3 < 90 and changes each non-numeric field in them to NA:
awk '$3 < 90 {for (i=2; i<=NF; i++) if ($i+0 != $i) $i = "NA"} 1' file
OTU Phy P.conf Class C.conf Ord ord.conf Spec S.conf
1 Mollusca 90 Bivalvia 80 Venerida 80 Rangia 80
2 NA 88 NA 20 NA 0 NA 0
3 Cnidaria 100 Coral 78 fire 22 octo 12
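If the name columns always sit at the even positions (2, 4, 6, 8), as they do in the sample, another option is to step through those positions directly instead of testing each field for being numeric. This is only a sketch under that layout assumption: mask_taxa is a hypothetical helper name, min is an illustrative awk variable holding the threshold, and NR > 1 is added to skip the header line explicitly.

```shell
# Hypothetical helper (not from the original post): mask_taxa THRESHOLD FILE
# Assumes name columns are at even positions and each is followed by its
# confidence column, as in the sample file.
mask_taxa() {
    awk -v min="$1" '
        # Skip the header; act only on rows whose phylum confidence is below min
        NR > 1 && $3 + 0 < min {
            # Visit fields 2, 4, 6, ... (the taxon names) and blank them out
            for (i = 2; i < NF; i += 2) $i = "NA"
        }
        1   # print every line, modified or not
    ' "$2"
}
```

With the sample file, mask_taxa 90 test should give the same output as shown above. Passing the threshold with -v makes it easy to try other cutoffs without editing the awk program.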