I wrote the following regex a long time ago. It must match at least 3 words and works with both Latin and Cyrillic characters: regex = '([^ ,;\d]{2,}[ ,;]{1,}){2,}[^ ,;\d]{2,}'
I would like to rewrite it to match hello but fail to match hello, because of the trailing comma. However, I would still like it to match hello, and, more, words.
Example matches:
hello
hello, test69
hello, test69, matches
Example non-matches:
hello,
hello test69
hello test69 matches
CodePudding user response:
You can use
^\w+(?:, *\w+)*$
In Python, you can use a shorter version with re.fullmatch, which anchors the match at both ends of the string:
re.fullmatch(r'\w+(?:, *\w+)*', text)
See the regex demo.
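To check the pattern against the examples from the question, a minimal sketch could look like this. The helper name is_word_list is hypothetical and only used for illustration; the pattern itself is the one suggested above.

```python
import re

# Pattern from the answer: one word, then zero or more ", word" groups.
PATTERN = re.compile(r'\w+(?:, *\w+)*')


def is_word_list(text: str) -> bool:
    """Hypothetical helper: True if text is a comma-separated word list
    with no trailing comma."""
    return PATTERN.fullmatch(text) is not None


samples = [
    "hello", "hello, test69", "hello, test69, matches",   # expected to match
    "hello,", "hello test69", "hello test69 matches",     # expected to fail
]
for sample in samples:
    print(repr(sample), is_word_list(sample))
```

Since \w is Unicode-aware for str patterns in Python 3, the same pattern should also accept Cyrillic words.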
Note that if your spaces can be any whitespace, replace the space in the regex with \s. If your words can only contain letters, replace each \w with [^\W\d_]. If your words can only contain letters and digits, replace every \w with [^\W_].
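The substitutions described above can be sketched as a small pattern builder. The function build_pattern and its parameters are assumptions made for this example, not part of the original answer.

```python
import re


def build_pattern(word: str = r'\w', space: str = ' ') -> re.Pattern:
    """Hypothetical builder: `word` is the class for one word character,
    `space` is what may follow a comma."""
    return re.compile(rf'{word}+(?:,{space}*{word}+)*')


default = build_pattern()                          # \w words, plain spaces
any_whitespace = build_pattern(space=r'\s')        # any whitespace after a comma
letters_only = build_pattern(word=r'[^\W\d_]')     # letters only
letters_digits = build_pattern(word=r'[^\W_]')     # letters and digits, no underscore

print(bool(letters_only.fullmatch("hello, test")))      # letters only
print(bool(letters_only.fullmatch("hello, test69")))    # contains digits
print(bool(any_whitespace.fullmatch("hello,\ttest69"))) # tab after the comma
```

Keeping the word class and the separator as parameters makes it easy to switch between the variants without retyping the whole expression.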
Details:
^ - start of string
\w+ - one or more word chars
(?:, *\w+)* - zero or more repetitions of a comma, zero or more spaces, and then one or more word chars
$ - end of string.
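The same breakdown can be kept next to the pattern itself with re.VERBOSE. This is only a sketch: the name ANNOTATED is made up for the example, and because verbose mode ignores bare spaces in the pattern, the literal space is written as [ ] here.

```python
import re

# The anchored pattern, annotated piece by piece.
ANNOTATED = re.compile(r'''
    ^            # start of string
    \w+          # one or more word chars
    (?:          # start of a non-capturing group
        ,[ ]*    # a comma, then zero or more spaces
        \w+      # one or more word chars
    )*           # repeat the group zero or more times
    $            # end of string
''', re.VERBOSE)

print(bool(ANNOTATED.search("hello, and, more, words")))
print(bool(ANNOTATED.search("hello,")))
```

One difference worth knowing: $ also matches just before a trailing newline, so ANNOTATED.search accepts "hello\n", while re.fullmatch with the unanchored pattern does not.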