I want to divide a sentence into words using regex. I'm using this code:
import re
sentence='<30>Jan 11 11:45:50 test-tt systemd[1]: tester-test.service: activation successfully.'
sentence = re.split(r'\s|,|>|<|\[|\]:', sentence)
but I'm not getting what I expect.
The expected output is:
['30', 'Jan', '11', '11:45:50', 'test-tt', 'systemd', '1', 'tester-test.service: activation successfully.']
but what I'm getting is:
['', '30', 'Jan', '11', '11:45:50', 'test-tt', 'systemd', '1', '', 'tester-test.service:', 'activation', 'successfully.']
I tried to ignore the whitespace, but it should only be ignored in the last long word, and I have no idea how to do that. Any suggestions or help? Thank you in advance.
CodePudding user response:
You can use
import re
sentence='<30>Jan 11 11:45:50 test-tt systemd[1]: tester-test.service: activation successfully.'
chunks = sentence.split(': ', 1)
result = re.findall(r'[^][\s,<>]+', chunks[0])
result.append(chunks[1])
print(result)
# => ['30', 'Jan', '11', '11:45:50', 'test-tt', 'systemd', '1', 'tester-test.service: activation successfully.']
Here,
chunks = sentence.split(': ', 1)
- splits the sentence into two chunks at the first ': ' substring
result = re.findall(r'[^][\s,<>]+', chunks[0])
- extracts all substrings consisting of one or more chars other than ], [, whitespace, a comma, < and > from the first chunk
result.append(chunks[1])
- appends the second chunk to the result list.
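If the input may not always contain ': ', the same two-step approach could be wrapped in a small helper. This is only a sketch: the name split_log_line is hypothetical, and the fallback (tokenizing the whole line when there is no ': ') is an assumption about what you would want in that case. It uses str.partition instead of str.split so that a missing separator does not raise an IndexError.

```python
import re

def split_log_line(line):
    """Hypothetical helper: tokenize the header part, keep the message intact."""
    # partition() splits at the first ': ' only; sep is '' when ': ' is absent
    header, sep, message = line.partition(': ')
    # One or more chars that are not ], [, whitespace, comma, < or >
    tokens = re.findall(r'[^][\s,<>]+', header)
    # Assumption: only append the message when a separator was actually found
    if sep:
        tokens.append(message)
    return tokens

sentence = '<30>Jan 11 11:45:50 test-tt systemd[1]: tester-test.service: activation successfully.'
print(split_log_line(sentence))
# => ['30', 'Jan', '11', '11:45:50', 'test-tt', 'systemd', '1', 'tester-test.service: activation successfully.']
```

With a line that has no ': ' at all, the helper simply returns the tokens of the whole line.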