I have a .tsv file that I would like to filter in Unix.
I want to select the rows that have certain numerical values (e.g. 30700, 10600, etc.) in a particular column.
Thus far, I have seen examples online where rows have been selected based on one particular value in a column. However, in my case, a particular column can have about 20-30 accepted values. How do I go about the subsetting of my data in this case?
CodePudding user response:
awk -F'\t' '{ if ($1 == 1 || $1 == 2) print $0; }'
would do the trick (the -F'\t' makes awk split on tabs only, so fields containing spaces stay intact), but nobody gets promoted for writing 40-term if statements, so you might like to consider:
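awk -F'\t' '
BEGIN { a[1] = a[2] = 1; }      # one array entry per accepted value
{ if ($1 in a) print $0; }      # print rows whose first column is an accepted value
'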
as a template. The nice thing about awk: it is such a flexible language that there are probably dozens of different ways to approach this. The difficult thing about awk: it is such a flexible language that there are probably dozens of different ways to approach this.
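With 20-30 accepted values, one of those other ways is to keep the values in a separate file, one per line, and let awk load them into the lookup array itself. The following is only a sketch under a few assumptions: the function name filter_tsv, the file names, and the sample data are all made up for illustration, the column of interest is passed as a number, and values are compared as exact strings (so 30700 will not match 30700.0).

```shell
# Hypothetical helper: print the rows of a TSV whose column COL holds one of
# the values listed (one per line) in VALUES_FILE.
# usage: filter_tsv VALUES_FILE COL DATA_FILE
filter_tsv() {
    awk -F'\t' -v col="$2" '
        NR == FNR { keep[$1]; next }   # first file: remember each accepted value
        $col in keep                   # second file: print rows whose column matches
    ' "$1" "$3"
}

# Throwaway sample data (assumed, not from the question)
tmp=$(mktemp -d)
printf '30700\n10600\n' > "$tmp/values.txt"
printf 'a\t30700\nb\t999\nc\t10600\n' > "$tmp/data.tsv"

# Keep rows whose second column is 30700 or 10600
filter_tsv "$tmp/values.txt" 2 "$tmp/data.tsv"
```

The NR == FNR test is true only while awk is reading the first file named on the command line, which is how the accepted values get loaded before the data is scanned. It relies on the values file being non-empty.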