I'm trying to write a regex in Perl that matches only if the string consists entirely of the characters I specify. It must fail if the string contains anything else, even if it also contains characters from my set. For instance:
if ($string !~ m/[ACTG]{1,}/) {
die "invalid sequence";
}
If I input DQJ as a sequence, the program dies. However, if I input ACH, it doesn't, since the string contains at least one character from the regex pattern.
I would like the match to fail if the string contains anything other than A, C, T or G.
PS: I am unfamiliar with the nomenclature here. What would I call what is inside // in the regex statement?
CodePudding user response:
You can make sure every character is a valid character.
die if !/^[ACTG]*\z/;
Using the faster tr///:
die if tr/ACTG// != length($_);
Or you can make sure the string doesn't contain an invalid character.
die if /[^ACTG]/;
Using the faster tr///:
die if tr/ACTG//c;
Thanks to @DavidO for suggesting tr///.
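The snippets above test $_. As a minimal sketch of how the tr///c variant could be applied to a named variable such as the $string from the question, the helper below wraps the check in a sub. The name is_valid_sequence is hypothetical and only for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of the original answer): returns true
# when $seq contains nothing but the characters A, C, T and G.
sub is_valid_sequence {
    my ($seq) = @_;
    # With the /c flag, tr counts the characters NOT in the list.
    # The replacement list is empty and /d is absent, so $seq is unchanged.
    return !( $seq =~ tr/ACTG//c );
}

# Try the inputs mentioned in the question, plus one valid sequence.
for my $string (qw(ACTG ACH DQJ)) {
    printf "%-4s %s\n", $string,
        is_valid_sequence($string) ? "valid" : "invalid";
}
```

Like the `*` quantifier in the regex version, this treats an empty string as valid. If an empty sequence should be rejected, add a length check or use `+` in the regex form.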
CodePudding user response:
Use the ^ character to anchor the match at the beginning of the string and $ to anchor it at the end.
if ($string !~ /^[ACTG]*$/){die "invalid sequence";}
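One caveat worth knowing: in Perl, $ also matches just before a trailing newline, so a string such as "ACTG\n" still passes this check. The sketch below compares it with the stricter \z anchor, which matches only at the very end of the string. The sample value is an assumption chosen to show the difference.

```perl
use strict;
use warnings;

# Assumed sample input: a line that still carries its trailing newline,
# e.g. read from STDIN without chomp.
my $string = "ACTG\n";

# $ matches at the end of the string OR just before a final newline.
print "accepted by \$ anchor\n" if $string =~ /^[ACTG]*$/;

# \z matches only at the absolute end, so the newline counts as invalid.
print "rejected by \\z anchor\n" if $string !~ /^[ACTG]*\z/;
```

If the input is chomped first, both anchors behave the same for this purpose.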