I have the following example strings, and I would like to extract only the middle part of each one, removing the patterns at the front and at the back:
Exp1: Error: -This can be an example. (YT-0E8)
Exp2: Warning: -Warning can happen too (WP-003)
Exp3: Error: Error can happen this way as well. (PP-W28)
I just want to get the following output:
Ans1: -This can be an example.
Ans2: -Warning can happen too
Ans3: Error can happen this way as well.
As you can see, I'm trying to eliminate the first pattern at the front, which can be either "Error: " or "Warning: " (with a space right after the colon), and the second pattern at the back, which is a space followed by an alphanumeric code in parentheses, e.g. (TT-T56).
I've got as far as this pattern to match the front and back parts, but can't find a way to complete it: ^(Warning: |Error: ).*\((\w+-\d+)\)$
Any way to solve this?
CodePudding user response:
I would use a regex replacement approach here:
import re

inp = ["Error: -This can be an example. (YT-0E8)", "Warning: -Warning can happen too (WP-003)", "Error: Error can happen this way as well. (PP-W28)"]
# Remove either the leading "label:" prefix or the trailing "(...)" term
output = [re.sub(r'^\w+:\s*|\s*\(.*?\)$', '', x) for x in inp]
print(output)
# ['-This can be an example.', '-Warning can happen too',
# 'Error can happen this way as well.']
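The pattern is an alternation: one branch matches the leading label, its colon, and any whitespace after it at the start of the string; the other matches the trailing term in parentheses, plus any whitespace before it, at the end. re.sub replaces every match with an empty string, so both ends are stripped and only the content you want is left behind.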
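If you would rather capture the middle part than delete the ends, which is closer to the pattern attempted in the question, something like the following sketch should work. It assumes every line has the shape "Error: " or "Warning: ", then the message, then a space and a code such as (YT-0E8) made of two alphanumeric runs joined by a hyphen. The names PATTERN and extract_message are made up for this illustration.

```python
import re

# Assumed line shape: "<Error|Warning>: <message> (<CODE>-<CODE>)"
# Group 1 captures the message; the label and the code are matched but not kept.
PATTERN = re.compile(r'^(?:Warning|Error): (.*?)\s*\(\w+-\w+\)$')


def extract_message(line):
    """Return the text between the label and the trailing code, or None if the line does not fit."""
    m = PATTERN.match(line)
    return m.group(1) if m else None


samples = [
    "Error: -This can be an example. (YT-0E8)",
    "Warning: -Warning can happen too (WP-003)",
    "Error: Error can happen this way as well. (PP-W28)",
]
messages = [extract_message(s) for s in samples]
print(messages)
```

Unlike the re.sub version, this returns None for a line that does not match the expected shape instead of passing it through unchanged, which may or may not be what you want.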