Detect strings containing only digits, letters and one or more question marks

Time:06-07

I am writing a Python regex that matches only strings that consist of letters, digits, and one or more question marks.

For example, regex1: ^[A-Za-z0-9?]+$ returns strings with or without ?

I want a regex2 that matches expressions such as ABC123?A, 1AB?CA?, ?2ABCD, ???, 123? but not ABC123, ABC.?1D1, ABC(a)?1d. In MySQL, I did the following and it works:

select *
from (
select * from norm_prod.skill_patterns
where pattern REGEXP '^[A-Za-z0-9?]+$') AS XXX
where XXX.pattern not REGEXP '^[A-Za-z0-9]+$'

CodePudding user response:

How about something like this:

^(?=.*\?)[a-zA-Z0-9\?]+$

As you can see here at regex101.com

Explanation

The (?=.*\?) is a positive lookahead that tells the regex that the start of the match should be followed by 0 or more characters and then a ? - i.e., there should be a ? somewhere in the match.
The [a-zA-Z0-9\?]+ matches one or more occurrences of the characters in the character class, i.e. a-z, A-Z, the digits 0-9, and the question mark ?.

Altogether, the regex first checks if there is a question mark somewhere in the string to be matched. If yes, then it matches the characters mentioned above. If either the ? is not present, or there is some foreign character, then the string is not matched.
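
To make the lookahead behaviour concrete, here is a minimal Python sketch that runs the pattern (with the `+` quantifier written out) against the examples from the question. The helper name `has_only_alnum_and_question` is made up for illustration and is not part of the original answer.

```python
import re

# Lookahead version: (?=.*\?) requires a ? somewhere in the string,
# then [a-zA-Z0-9?]+ restricts the whole string to the allowed characters.
PATTERN = re.compile(r'^(?=.*\?)[a-zA-Z0-9?]+$')

def has_only_alnum_and_question(text):
    """Hypothetical helper: True if text is letters/digits/? only, with at least one ?."""
    return PATTERN.search(text) is not None

# Examples taken from the question.
should_match = ["ABC123?A", "1AB?CA?", "?2ABCD", "???", "123?"]
should_fail = ["ABC123", "ABC.?1D1", "ABC(a)?1d"]

for s in should_match + should_fail:
    print(f"{s!r}: {has_only_alnum_and_question(s)}")
```

Inside a character class the ? does not need escaping, so [a-zA-Z0-9?] and [a-zA-Z0-9\?] behave the same here.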

CodePudding user response:

You can validate an alphanumeric string with one or more question marks using

where pattern REGEXP '^[A-Za-z0-9]*([?][A-Za-z0-9]*)+$'

In Python:

re.search(r'^[A-Za-z0-9]*(?:\?[A-Za-z0-9]*)+$', text)

See the regex demo.

Details:

  • ^ - start of string
  • [A-Za-z0-9]* - zero or more letters or digits
  • ([?][A-Za-z0-9]*)+ - one or more repetitions of a ? char and then zero or more letters or digits
  • $ - end of string.

If you plan to apply this to any Unicode string, consider using POSIX character classes:

where pattern REGEXP '^[[:alnum:]]*([?][[:alnum:]]*)+$'

where [[:alnum:]] matches any letters and digits. In Python:

re.search(r'^[^\W_]*(?:\?[^\W_]*)+$', text)

In Python 3, shorthand character classes are Unicode-aware by default for str patterns, and the [^\W_] pattern is a \w (which matches letters, digits, and the underscore) with _ subtracted from it.
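
As a rough illustration of the difference, the sketch below compares the ASCII-only pattern with the Unicode-aware one on a few sample strings. The sample strings and the constant names are my own and not from the original answer.

```python
import re

# ASCII-only version versus the Unicode-aware [^\W_] version.
ASCII_PATTERN = re.compile(r'^[A-Za-z0-9]*(?:\?[A-Za-z0-9]*)+$')
UNICODE_PATTERN = re.compile(r'^[^\W_]*(?:\?[^\W_]*)+$')

# Illustrative inputs (assumed, not from the question).
samples = ["abc?", "caf\u00e9?", "abc_?", "abc"]

for s in samples:
    ascii_ok = ASCII_PATTERN.search(s) is not None
    unicode_ok = UNICODE_PATTERN.search(s) is not None
    print(f"{s!r}: ascii={ascii_ok} unicode={unicode_ok}")
```

Only the accented string should differ between the two patterns: the underscore is rejected by both, and the string without a ? fails both because the group must occur at least once.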

CodePudding user response:

If there should be at least a single question mark present using MySQL or Python:

^[A-Za-z0-9]*\?[A-Za-z0-9?]*$

Explanation

  • ^ Start of string
  • [A-Za-z0-9]* Match optional chars A-Z a-z 0-9
  • \? Match a question mark
  • [A-Za-z0-9?]* Match optional chars A-Z a-z 0-9 or ?
  • $ End of string

See a regex demo.
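
One possible way to use this pattern in Python is sketched below. It uses re.fullmatch instead of the ^ and $ anchors, which is my own substitution: in Python, $ also matches just before a trailing newline, so the anchored form accepts "123?\n" while fullmatch does not. The function name `is_valid` is hypothetical.

```python
import re

# Same pattern as above, without anchors; fullmatch requires the
# whole string to match, including any trailing newline.
PATTERN = re.compile(r'[A-Za-z0-9]*\?[A-Za-z0-9?]*')
ANCHORED = re.compile(r'^[A-Za-z0-9]*\?[A-Za-z0-9?]*$')

def is_valid(text):
    """Hypothetical helper: letters, digits and at least one ? only."""
    return PATTERN.fullmatch(text) is not None

for s in ["ABC123?A", "???", "ABC123", "ABC.?1D1", "123?\n"]:
    print(f"{s!r}: fullmatch={is_valid(s)} anchored={ANCHORED.search(s) is not None}")
```

If a trailing newline should be tolerated, the anchored form with re.search is the simpler choice.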

In MySQL, double-escape the backslash, like this:

REGEXP '^[A-Za-z0-9]*\\?[A-Za-z0-9?]*$'