I would like to generate the following output from the string "[cid:12d32323232dde]foo foo foo \r\n\r\n\r\n[cid:123fsr3ef234fsdfere]\r\n"
Expected output:
foo foo foo \r\n\r\n\r\n
CodePudding user response:
So you want to remove all [cid:...] blocks and any newlines/carriage returns trailing them?
>>> import re
>>> s = "[cid:12d32323232dde]foo foo foo \r\n\r\n\r\n[cid:123fsr3ef234fsdfere]\r\n"
>>> re.sub(r"\[cid:(.+?)\][\r\n]*", "", s)
'foo foo foo \r\n\r\n\r\n'
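The same substitution can be wrapped in a small reusable function. This is only a sketch: the names `CID_BLOCK` and `strip_cid_blocks` are made up for illustration, and the pattern uses a negated character class (`[^\]]*`) instead of a lazy group, which is an assumption about what counts as the inside of a cid block (anything up to the next `]`).

```python
import re

# Hypothetical names, not from the question or the answer above.
# Matches "[cid:...]" plus any carriage returns/newlines that directly follow it.
CID_BLOCK = re.compile(r"\[cid:[^\]]*\][\r\n]*")


def strip_cid_blocks(text: str) -> str:
    """Remove every [cid:...] block and the line breaks trailing it."""
    return CID_BLOCK.sub("", text)


s = "[cid:12d32323232dde]foo foo foo \r\n\r\n\r\n[cid:123fsr3ef234fsdfere]\r\n"
result = strip_cid_blocks(s)
print(repr(result))  # 'foo foo foo \r\n\r\n\r\n'
```

Note that the line breaks before the second block survive, because only the ones that come after a cid block are consumed by the pattern.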
CodePudding user response:
You can try this regex search
import regex as re
x = r"[cid:12d32323232dde]foo foo foo \r\n\r\n\r\n[cid:123fsr3ef234fsdfere]\r\n"
re.search(r"\]([^]]+)\[", x)[1]
First, we import the regex module (the standard-library re module works here too).
Next, we make the string a raw string:
x = r"" -> the r prefix before the string makes it raw, so \r and \n stay as literal backslash sequences.
With the raw string, the result is foo foo foo \r\n\r\n\r\n
Without the raw string, the result is foo foo foo followed by actual carriage returns and newlines, which print as blank lines.
Then we do a regex search to get the text between ] and [. The re.search method returns a match object, and this match object has two groups:
re.search(r"\]([^]]+)\[", x)[0] -> group 0, the whole match, including ] and [
re.search(r"\]([^]]+)\[", x)[1] -> group 1, the captured text, without ] and [
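To make the difference between the two groups and between raw and non-raw input concrete, here is a small sketch using the standard-library `re` module. The variable names (`raw`, `plain`, `pattern`, and so on) are chosen for illustration only.

```python
import re

# Raw string: \r and \n are literal backslash + letter pairs.
raw = r"[cid:12d32323232dde]foo foo foo \r\n\r\n\r\n[cid:123fsr3ef234fsdfere]\r\n"
# Normal string: \r and \n are real carriage-return and newline characters.
plain = "[cid:12d32323232dde]foo foo foo \r\n\r\n\r\n[cid:123fsr3ef234fsdfere]\r\n"

# Capture everything between the first "]" and the next "[".
pattern = re.compile(r"\]([^\]]+)\[")

m_raw = pattern.search(raw)
m_plain = pattern.search(plain)

whole = m_raw[0]        # group 0: the full match, including "]" and "["
inner_raw = m_raw[1]    # group 1: only the captured text
inner_plain = m_plain[1]

print(whole)
print(inner_raw)
print(repr(inner_plain))
```

With the raw input, the captured text is 24 characters long because each `\r` and `\n` counts as two characters; with the normal input it is 18 characters long, since each escape is a single control character.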