Regex to match a list of one or more comma-separated words, unless the string ends in a comma

Time:09-17

A long time ago I wrote the following regex, which must match at least 3 words and works with both Latin and Cyrillic characters: regex = '([^ ,;\d]{2,}[ ,;]{1,}){2,}[^ ,;\d]{2,}'

I would like to rewrite it so that it matches "hello" but fails to match "hello," because of the trailing comma. However, I would still like it to match "hello, and, more, words".

Example matches: "hello"; "hello, test69"; "hello, test69, matches"

Example non-matches: "hello,"; "hello test69"; "hello test69 matches"

CodePudding user response:

You can use

^\w+(?:, *\w+)*$

In Python, you can use a shorter version if you use re.fullmatch:

re.fullmatch(r'\w+(?:, *\w+)*', text)
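As a rough check of the re.fullmatch approach, the sketch below runs the pattern against sample strings based on the question. The names WORD_LIST and is_word_list are made up for illustration, and the Cyrillic sample is an added assumption; it relies on \w being Unicode-aware for str patterns in Python 3.

```python
import re

# Hypothetical name; the pattern is a word, then zero or more ", word" groups.
WORD_LIST = re.compile(r'\w+(?:, *\w+)*')

def is_word_list(text: str) -> bool:
    """Return True if text is one or more comma-separated words
    with no trailing comma (hypothetical helper)."""
    # fullmatch anchors at both ends, so no ^ or $ is needed.
    return WORD_LIST.fullmatch(text) is not None

samples = [
    "hello",                    # single word
    "hello, test69",            # two words
    "hello, and, more, words",  # several words
    "привет, мир",              # Cyrillic words (assumed sample)
    "hello,",                   # trailing comma
    "hello test69",             # missing comma
]

for sample in samples:
    print(repr(sample), is_word_list(sample))
```

The first four samples should be accepted and the last two rejected, since a comma must always be followed by another word and a space alone is not a separator.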


Note that if your spaces can be any whitespace, replace the space with \s in the regex. If your words can only contain letters, replace each \w with [^\W\d_]. If your words can only contain letters and digits, replace every \w with [^\W_].
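These variants can be compared side by side with a small helper. This is only a sketch: build_pattern and its parameters are hypothetical names, not part of the re module, and they simply substitute the word class and the space class into the same pattern shape.

```python
import re

def build_pattern(word: str = r'\w', space: str = ' ') -> re.Pattern:
    """Hypothetical helper: compile word+(?:,space*word+)* for the
    given word-character class and space class."""
    return re.compile(rf'{word}+(?:,{space}*{word}+)*')

default = build_pattern()                       # \w words, literal spaces
any_whitespace = build_pattern(space=r'\s')     # tabs, newlines, etc. after commas
letters_only = build_pattern(word=r'[^\W\d_]')  # letters only, any script
letters_digits = build_pattern(word=r'[^\W_]')  # letters and digits, no underscore

print(bool(default.fullmatch("hello,\ttest69")))          # tab is not a literal space
print(bool(any_whitespace.fullmatch("hello,\ttest69")))   # \s accepts the tab
print(bool(letters_only.fullmatch("hello, test69")))      # digits are excluded
print(bool(letters_digits.fullmatch("hello, test_69")))   # underscore is excluded
```

The classes [^\W\d_] and [^\W_] work by negation: "not a non-word character" is a word character, from which digits and/or the underscore are then removed.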

Details:

  • ^ - start of string
  • \w+ - one or more word chars
  • (?:, *\w+)* - zero or more repetitions of a comma, zero or more spaces, and then one or more word chars
  • $ - end of string.