I want to create a regex in Python that finds words starting with "@" or "@.".
I have created the following regex, but each string in the output contains an extra leading space, as you can see:
import re
regex = r'\s@\/?[\w\.\-]{2,}'
exp = 'george want@to play @.hdgska football @dddada'
re.findall(regex, exp)
Output: [' @.hdgska', ' @dddada']
However, the output that I want to have is the following
Output: ['@.hdgska', '@dddada']
I would be grateful if you could help me!
CodePudding user response:
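Your pattern actually matches the leading whitespace, because \s is part of the match and therefore ends up in each result.
Also, after the @ the pattern allows an optional / with \/?, but according to your description the word should optionally start with a dot instead.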
You could match for example an optional dot, and then 2 or more times the allowed characters in the character class.
At the left of the @ sign, either assert a non-word boundary with \B or assert a whitespace boundary with (?<!\S).
Note that you don't have to escape the dot in the character class, nor the hyphen when it is the last character in the class.
\B@\.?[\w.-]{2,}
Another option:
(?<!\S)@\.?[\w.-]{2,}
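The two options are not fully interchangeable. As a rough sketch (the sample string below is made up for illustration and is not from the question), \B also accepts an @ that follows punctuation such as ( or #, while (?<!\S) only accepts an @ at the start of the string or after whitespace:

```python
import re

# The two patterns suggested above.
non_word_boundary = r"\B@\.?[\w.-]{2,}"
whitespace_boundary = r"(?<!\S)@\.?[\w.-]{2,}"

# Hypothetical sample text, chosen to show where the two patterns differ.
sample = "mail me(@alice) or #@bob, not carol@home, but @.dave"

# \B only requires that the character before @ is not a word character,
# so "(@alice" and "#@bob" are matched as well.
print(re.findall(non_word_boundary, sample))    # ['@alice', '@bob', '@.dave']

# (?<!\S) requires that the character before @ is not a non-whitespace
# character, i.e. @ is at the start of the string or preceded by whitespace.
print(re.findall(whitespace_boundary, sample))  # ['@.dave']
```

If the words are always separated by whitespace, as in the question, both patterns give the same result.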
Example
import re
pattern = r"(?<!\S)@\.?[\w.-]{2,}"
s = "george want@to play @.hdgska football @dddada"
print(re.findall(pattern, s))
Output
['@.hdgska', '@dddada']
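The note about not needing to escape the dot and the hyphen inside the character class can be checked with a small sketch. The second test string is an assumption added here to exercise the hyphen and the dot; it does not come from the question.

```python
import re

# Character class written with backslashes, as in the question,
# and without them, as in the answer.
escaped = r"(?<!\S)@\.?[\w\.\-]{2,}"
unescaped = r"(?<!\S)@\.?[\w.-]{2,}"

s = "george want@to play @.hdgska football @dddada"
extra = "ping @a-b.c and @.x_y-z"  # hypothetical input containing '-' and '.'

for text in (s, extra):
    # Inside [...] a dot is literal, and a hyphen placed last is literal too,
    # so both spellings should return the same matches.
    assert re.findall(escaped, text) == re.findall(unescaped, text)
    print(re.findall(unescaped, text))
```

The hyphen only needs escaping (or moving to the start or end of the class) when it could otherwise be read as a range, as in [a-z].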