I am writing a Python regex that matches only strings consisting of letters, digits, and one or more question marks.
For example, regex1: ^[A-Za-z0-9?]+$
matches strings with or without a ?
I want a regex2 that matches expressions such as ABC123?A, 1AB?CA?, ?2ABCD, ???, 123?
but not ABC123, ABC.?1D1, ABC(a)?1d
In MySQL, I did the following and it works:
select *
from (
select * from norm_prod.skill_patterns
where pattern REGEXP '^[A-Za-z0-9?]+$') AS XXX
where XXX.pattern NOT REGEXP '^[A-Za-z0-9]+$'
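For comparison, here is a minimal Python sketch of the same two-step filter. It assumes both patterns in the query are meant to end in +$. The names ALLOWED, ALNUM_ONLY and two_step_match are illustrative and do not come from the original query.

```python
import re

# Step 1: the string contains only letters, digits and question marks.
ALLOWED = re.compile(r'^[A-Za-z0-9?]+$')
# Step 2: the string contains only letters and digits (no question mark).
ALNUM_ONLY = re.compile(r'^[A-Za-z0-9]+$')


def two_step_match(s):
    """Keep strings that pass step 1 but are rejected by step 2."""
    return bool(ALLOWED.search(s)) and not ALNUM_ONLY.search(s)


samples = ['ABC123?A', '1AB?CA?', '?2ABCD', '???', '123?',
           'ABC123', 'ABC.?1D1', 'ABC(a)?1d']
matched = [s for s in samples if two_step_match(s)]
print(matched)  # ['ABC123?A', '1AB?CA?', '?2ABCD', '???', '123?']
```

This works, but it scans each string twice, which is what a single regex2 would avoid.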
CodePudding user response:
How about something like this:
^(?=.*\?)[a-zA-Z0-9\?]+$
As you can see here at regex101.com
Explanation
The (?=.*\?) is a positive lookahead that tells the regex that the start of the match should be followed by 0 or more characters and then a ?, i.e., there should be a ? somewhere in the match.
The [a-zA-Z0-9\?]+ matches one or more occurrences of the characters given in the character class, i.e. a-z, A-Z, the digits 0-9, and the question mark ?.
Altogether, the regex first checks if there is a question mark somewhere in the string to be matched. If yes, then it matches the characters mentioned above. If either the ? is not present, or there is some foreign character, then the string is not matched.
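As a quick check, the lookahead pattern can be run against the examples from the question. This is only a sketch; the names LOOKAHEAD, should_match and should_not_match are made up for the illustration.

```python
import re

# Lookahead requires a '?' somewhere; the character class then restricts
# the whole string to letters, digits and question marks.
LOOKAHEAD = re.compile(r'^(?=.*\?)[a-zA-Z0-9?]+$')

should_match = ['ABC123?A', '1AB?CA?', '?2ABCD', '???', '123?']
should_not_match = ['ABC123', 'ABC.?1D1', 'ABC(a)?1d']

for s in should_match + should_not_match:
    print(f'{s!r}: {bool(LOOKAHEAD.search(s))}')
```

Inside a character class the ? needs no backslash, so [a-zA-Z0-9?] and [a-zA-Z0-9\?] behave the same.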
CodePudding user response:
You can validate an alphanumeric string with one or more question marks using
where pattern REGEXP '^[A-Za-z0-9]*([?][A-Za-z0-9]*)+$'
In Python:
re.search(r'^[A-Za-z0-9]*(?:\?[A-Za-z0-9]*)+$', text)
See the regex demo.
Details:
^ - start of string
[A-Za-z0-9]* - zero or more letters or digits
([?][A-Za-z0-9]*)+ - one or more repetitions of a ? char and then zero or more letters or digits
$ - end of string.
If you plan to apply this to any Unicode string, consider using POSIX character classes:
where pattern REGEXP '^[[:alnum:]]*([?][[:alnum:]]*)+$'
where [[:alnum:]] matches any letters and digits. In Python:
re.search(r'^[^\W_]*(?:\?[^\W_]*)+$', text)
In Python 3, all shorthand character classes are Unicode-aware by default for str patterns, and the [^\W_] pattern is a \w (which matches letters, digits, and the underscore) with _ subtracted from it.
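The difference between the ASCII and the Unicode-aware variant can be seen with a non-ASCII word. This is a small sketch assuming Python 3 str patterns; the sample strings and the names ASCII_PATTERN and UNICODE_PATTERN are hypothetical.

```python
import re

# ASCII-only letters and digits around one or more question marks.
ASCII_PATTERN = re.compile(r'^[A-Za-z0-9]*(?:\?[A-Za-z0-9]*)+$')
# [^\W_] is "any word character except the underscore".
UNICODE_PATTERN = re.compile(r'^[^\W_]*(?:\?[^\W_]*)+$')

print(bool(ASCII_PATTERN.search('Größe?')))    # False: 'ö' and 'ß' are not in A-Za-z
print(bool(UNICODE_PATTERN.search('Größe?')))  # True
print(bool(UNICODE_PATTERN.search('a_b?')))    # False: underscore is excluded
```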
CodePudding user response:
If at least one question mark must be present, you can use this in MySQL or Python:
^[A-Za-z0-9]*\?[A-Za-z0-9?]*$
Explanation
^ - Start of string
[A-Za-z0-9]* - Match optional chars A-Z a-z 0-9
\? - Match a question mark
[A-Za-z0-9?]* - Match optional chars A-Z a-z 0-9 or ?
$ - End of string
See a regex demo.
In MySQL, double the backslash so that the string literal passes \? to the regex engine:
REGEXP '^[A-Za-z0-9]*\\?[A-Za-z0-9?]*$'
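In Python, the same pattern could be wrapped in a small helper. The sketch below uses re.fullmatch instead of the ^ and $ anchors, which is a substitution on my part and not what the answer wrote. The reason is that $ also matches just before a trailing newline. The name is_valid is hypothetical.

```python
import re

# Leading alphanumerics, one mandatory '?', then alphanumerics or more '?'.
PATTERN = re.compile(r'[A-Za-z0-9]*\?[A-Za-z0-9?]*')


def is_valid(s):
    """True if s is only letters, digits and at least one question mark."""
    return PATTERN.fullmatch(s) is not None


for s in ['ABC123?A', '???', 'ABC123', 'ABC.?1D1', '123?\n']:
    print(f'{s!r}: {is_valid(s)}')
```

With re.search and the anchored form, '123?\n' would be accepted because $ matches before the final newline. Using fullmatch (or \Z in place of $) rejects it.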