I have this code in my file:
import re
sample = """Name: @s
Owner: @a[tag=Admin]"""
target = r"@[sae](\[[\w{}=, ]*\])?"
regex = re.split(target, sample)
print(regex)
I want to split out every word that starts with @, like this:
["Name: ", "@s", "\nOwner: ", "@a[tag=Admin]"]
But instead it gives this:
['Name: ', None, '\nOwner: ', '[tag=Admin]', '']
How can I separate them correctly?
CodePudding user response:
I would use re.findall here:
import re
sample = """Name: @s
Owner: @a[tag=Admin]"""
parts = re.findall(r'@\w+(?:\[.*?\])?|\s*\S+\s*', sample)
print(parts) # ['Name: ', '@s', '\nOwner: ', '@a[tag=Admin]']
The regex pattern used here says to match:
@\w+            a tag such as @some_tag
(?:\[.*?\])?    followed by an optional [...] term
|               OR
\s*\S+\s*       any other non-whitespace term,
                including optional whitespace on both sides
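The same pattern can be written with re.VERBOSE so that each part carries its own comment. This is only a minimal sketch: the names SELECTOR_OR_WORD and split_selectors are made up for illustration, and it assumes the quantifiers after \w and \S are both +.

```python
import re

# Hypothetical names; the pattern is the one from this answer,
# spelled out with re.VERBOSE so whitespace and comments are ignored.
SELECTOR_OR_WORD = re.compile(r"""
    @\w+            # a selector such as @s or @a
    (?:\[.*?\])?    # optionally followed by a [...] term
    |               # OR
    \s*\S+\s*       # any other run of non-whitespace, with surrounding whitespace
""", re.VERBOSE)


def split_selectors(text):
    """Return the selectors and the text between them as a list."""
    return SELECTOR_OR_WORD.findall(text)


sample = """Name: @s
Owner: @a[tag=Admin]"""
print(split_selectors(sample))
```

Two caveats follow from the second alternative: it matches one whitespace-delimited word at a time, so "Hello world @s" comes back as three items, and a selector glued to the preceding text, as in "Name:@s", is not separated at all.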
CodePudding user response:
If I understand the requirements correctly, you could do that as follows:
import re
s = """Name: @s
Owner: @a[tag=Admin]
"""
rgx = r'(?=@.*)|(?=\r?\n[^@\r\n]*)'
re.split(rgx, s)  # needs Python 3.7+, where re.split also splits on empty matches
#=> ['Name: ', '@s', '\nOwner: ', '@a[tag=Admin]', '\n']
The regular expression can be broken down as follows.
(?= # begin a positive lookahead
@.* # match '@' followed by >= 0 chars other than line terminators
) # end positive lookahead
| # or
(?= # begin a positive lookahead
\r?\n # match a line terminator
[^@\r\n]* # match >= 0 characters other than '@' and line terminators
) # end positive lookahead
Notice that matches are zero-width.
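Because the matches are zero-width, re.split consumes nothing and every character of the input survives in the output. The question's original pattern, by contrast, consumes each selector. re.split drops the matched text and returns only what the capturing groups caught, which is why the output contained None (the optional group did not take part in the match) and '[tag=Admin]' without the '@a'. A possible fix for that approach, sketched here as one option and not the only one, is to capture the whole selector and make the inner group non-capturing:

```python
import re

sample = """Name: @s
Owner: @a[tag=Admin]"""

# The outer group captures the whole selector, so re.split keeps it in the result;
# the inner (?:...) group is non-capturing, so it adds no extra items.
target = r"(@[sae](?:\[[\w{}=, ]*\])?)"

raw = re.split(target, sample)
# raw ends with '' because the text finishes with a match; drop empty strings.
parts = [p for p in raw if p]
print(parts)
```

The filter also removes an empty first item, which would appear if the text started with a selector.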