I have a TSV file whose rows have different numbers of columns:
1 123 123 a b c
1 123 b c
1 345 345 a b c
I would like to extract only the rows with 6 columns:
1 123 123 a b c
1 345 345 a b c
How can I do that in bash (awk, sed, or something else)?
CodePudding user response:
Using Awk
$ awk -F'\t' 'NF==6' file
1 123 123 a b c
1 345 345 a b c
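If the wanted column count needs to change, the same test can be parameterized. A minimal sketch, assuming a temporary sample file built here for illustration and a variable name n chosen for this example (neither is from the question):

```shell
# Build a small tab-separated sample in a temporary file (illustrative data).
tmp=$(mktemp)
printf '1\t123\t123\ta\tb\tc\n1\t123\tb\tc\n1\t345\t345\ta\tb\tc\n' > "$tmp"

# n is the wanted column count (hypothetical variable name);
# NF is the number of fields awk found on the current line.
result=$(awk -F'\t' -v n=6 'NF == n' "$tmp")
printf '%s\n' "$result"

rm -f "$tmp"
```

Changing n=6 to another number selects rows with that many columns; NF >= n would keep rows with at least n columns instead.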
CodePudding user response:
Using GNU sed
Let the content of file.txt be
1 123 123 a b c
1 123 b c
1 345 345 a b c
1 777 777 a b c d
then
sed -n '/^[^\t]*\t[^\t]*\t[^\t]*\t[^\t]*\t[^\t]*\t[^\t]*$/p' file.txt
gives output
1 123 123 a b c
1 345 345 a b c
Explanation: -n turns off default printing, so the sole action is to print (p) the lines matching the pattern, which is anchored at the beginning (^) and end ($) and consists of 6 columns of non-TAB characters separated by single TABs. This code uses only very basic sed features but, as you might observe, it is longer than the AWK version and N is not as easy to adjust.
(tested in GNU sed 4.2.2)
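Adjusting N becomes easier if the repeated part is written as an interval expression in extended regex mode. This is only a sketch, not the answer's tested command: it assumes a sed that supports -E, and the variable names n and tab as well as the temporary sample file are made up for the example. A literal TAB is passed through a shell variable so the pattern does not depend on \t handling inside brackets.

```shell
# Build the sample from the answer in a temporary file (illustrative data).
tmp=$(mktemp)
printf '1\t123\t123\ta\tb\tc\n1\t123\tb\tc\n1\t345\t345\ta\tb\tc\n1\t777\t777\ta\tb\tc\td\n' > "$tmp"

n=6                    # wanted column count (hypothetical variable name)
tab=$(printf '\t')     # a literal TAB character

# n-1 repetitions of "field followed by TAB", then one last field, anchored at both ends.
result=$(sed -nE "/^([^$tab]*$tab){$((n - 1))}[^$tab]*\$/p" "$tmp")
printf '%s\n' "$result"

rm -f "$tmp"
```

With n=6 this keeps the two 6-column rows and drops both the 4-column and the 7-column row.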
CodePudding user response:
This might work for you (GNU sed):
sed -nE 's/\S+/&/6p' file
This will print lines with 6 or more fields, where a field is a run of non-whitespace characters.
sed -nE 's/\S+/&/6;T;s//&/7;t;p' file
This will print lines with exactly 6 fields.