Command to find exact digit matching in Linux?

I have a FASTA dataset whose lines look like this:

13_seq2344_ATCGACGGAACTGA
1342_seq2134_AGCTGTGGCAT
130_SEQ2289_TCGAATCGAGGAAC

I want to remove only the lines that contain exactly "13", not lines where 13 is part of a longer number such as 1342 or 130.

So my output should look like this:

1342_seq2134_AGCTGTGGCAT
130_SEQ2289_TCGAATCGAGGAAC

I have tried grep -w, grep -o, and grep -E, but none of them work for me. For example: grep -o "13" filename

Please suggest a command that works.

CodePudding user response：

With your shown samples, please try the following awk code. It sets the field separator to _ and prints a line only if its first field ($1) is NOT 13.

awk -F'_' '$1!="13"' Input_file
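To check the awk approach against the sample lines from the question, here is a minimal sketch. The file name sample.txt and the variable awk_result are assumptions made for this demo only, not names from the question.

```shell
# Recreate the three sample lines from the question in a scratch file.
# (sample.txt is a hypothetical name chosen for this demo.)
printf '%s\n' \
  '13_seq2344_ATCGACGGAACTGA' \
  '1342_seq2134_AGCTGTGGCAT' \
  '130_SEQ2289_TCGAATCGAGGAAC' > sample.txt

# Split each line on "_" and keep it only when the first field
# is not exactly the string "13".
awk_result=$(awk -F'_' '$1 != "13"' sample.txt)

printf '%s\n' "$awk_result"
```

Because the comparison is against the whole first field, 1342 and 130 are kept even though they start with 13.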

CodePudding user response：

If the number 13 should not appear anywhere in the line, you can match 13 with no digit directly to its left or right, and use -v to invert the match.

The -P option enables Perl-compatible regular expressions, which are needed for the lookarounds.

grep -vP '(?<!\d)13(?!\d)' file

Or use a negative lookahead to assert that the line contains no standalone 13 (a 13 with no digit directly before or after it):

grep -P '^(?!.*(?<!\d)13(?!\d))' file

Output

1342_seq2134_AGCTGTGGCAT
130_SEQ2289_TCGAATCGAGGAAC
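If your grep was built without -P support, the same idea can be sketched with an extended regular expression instead of lookarounds. This is an alternative to the commands above, not a replacement for them; the file name sample.txt and the variable names are assumptions for this demo. It also shows why grep -w from the question does not help: "_" counts as a word character, so the 13 in "13_seq..." is not a whole word.

```shell
# Recreate the three sample lines from the question in a scratch file.
# (sample.txt is a hypothetical name chosen for this demo.)
printf '%s\n' \
  '13_seq2344_ATCGACGGAACTGA' \
  '1342_seq2134_AGCTGTGGCAT' \
  '130_SEQ2289_TCGAATCGAGGAAC' > sample.txt

# grep -w treats "_" as part of a word, so "13" followed by "_" is not
# matched as a whole word and -v removes nothing.
w_result=$(grep -vw '13' sample.txt)

# ERE without lookarounds: 13 must be preceded by start-of-line or a
# non-digit, and followed by a non-digit or end-of-line.
ere_result=$(grep -vE '(^|[^0-9])13([^0-9]|$)' sample.txt)

printf '%s\n' "$ere_result"
```

This pattern looks for a standalone 13 anywhere in the line. If only the leading ID matters, anchoring the pattern as '^13_' would be a narrower check.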