Regex: Extract a part of a string eliminating specific patterns of it

Time:06-21

I have the following example strings, and I would like to extract only the middle part of each one, dropping the patterns at the front and at the back:

Exp1: Error: -This can be an example. (YT-0E8)

Exp2: Warning: -Warning can happen too (WP-003)

Exp3: Error: Error can happen this way as well. (PP-W28)

I just want to get the following output:

Ans1: -This can be an example.

Ans2: -Warning can happen too

Ans3: Error can happen this way as well.

As you can see, I'm trying to remove the FIRST pattern at the front, which is either Error: or Warning: followed by a space, and the SECOND pattern at the back, which is a space followed by an alphanumeric code in brackets, such as (TT-T56).

I've got as far as matching the front and back patterns, but can't find a way to complete it: ^(Warning: |Error: ).*\((\w+-\d+)\)$

Any way to solve this?

CodePudding user response:

I would use a regex replacement approach here:

import re

inp = ["Error: -This can be an example. (YT-0E8)", "Warning: -Warning can happen too (WP-003)", "Error: Error can happen this way as well. (PP-W28)"]
# Remove either the leading "Label:" prefix or the trailing "(code)" suffix
output = [re.sub(r'^\w+:\s*|\s*\(.*?\)$', '', x) for x in inp]
print(output)
# ['-This can be an example.', '-Warning can happen too',
#  'Error can happen this way as well.']

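The pattern is an alternation: one branch matches the leading label, its colon, and any whitespace after it; the other matches the trailing term in parentheses together with the whitespace before it. re.sub replaces each match with an empty string, leaving behind the content you want.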
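A stricter alternative, closer to the pattern attempted in the question, is to capture the middle part with a group instead of deleting the two ends. The sketch below is only an illustration. It assumes the label is always exactly Error or Warning, and that the trailing code is two alphanumeric runs joined by a hyphen. The names PATTERN and extract_message are hypothetical, not part of the original answer.

```python
import re

# Assumption: label is "Error" or "Warning"; the trailing code looks like (YT-0E8),
# i.e. two alphanumeric runs joined by a hyphen, preceded by one space.
PATTERN = re.compile(r'^(?:Error|Warning): (.*) \(\w+-\w+\)$')

def extract_message(line):
    """Return the captured middle part, or None if the line does not fit the shape."""
    m = PATTERN.match(line)
    return m.group(1) if m else None

inp = [
    "Error: -This can be an example. (YT-0E8)",
    "Warning: -Warning can happen too (WP-003)",
    "Error: Error can happen this way as well. (PP-W28)",
]
output = [extract_message(x) for x in inp]
print(output)
# ['-This can be an example.', '-Warning can happen too',
#  'Error can happen this way as well.']
```

Unlike the replacement approach, this returns None for lines that do not have both the expected label and the expected code, which may or may not be what you want. It also leaves any earlier parentheses inside the message untouched, because only the final hyphenated code is matched.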
