Removing a particular pattern from a given string in Python

Time:03-08

I would like to generate the following output from the string "[cid:12d32323232dde]foo foo foo \r\n\r\n\r\n[cid:123fsr3ef234fsdfere]\r\n"

Expected output:

foo foo foo \r\n\r\n\r\n

CodePudding user response:

So - remove all [cid:...] blocks and any newlines/carriage-returns trailing them?

>>> import re
>>> s = "[cid:12d32323232dde]foo foo foo \r\n\r\n\r\n[cid:123fsr3ef234fsdfere]\r\n"
>>> re.sub(r"\[cid:(.+?)\][\r\n]*", "", s)
'foo foo foo \r\n\r\n\r\n'
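
For repeated use, the substitution approach can be wrapped in a small function. This is only a sketch: the names `CID_BLOCK` and `strip_cid_blocks` are made up for illustration, and it assumes every marker has the form `[cid:...]` with at least one character after the colon.

```python
import re

# Hypothetical names, chosen for this example only.
# Matches one [cid:...] marker plus any CR/LF characters directly after it.
CID_BLOCK = re.compile(r"\[cid:.+?\][\r\n]*")


def strip_cid_blocks(text: str) -> str:
    """Remove every [cid:...] marker and the line breaks that trail it."""
    return CID_BLOCK.sub("", text)


s = "[cid:12d32323232dde]foo foo foo \r\n\r\n\r\n[cid:123fsr3ef234fsdfere]\r\n"
print(repr(strip_cid_blocks(s)))  # 'foo foo foo \r\n\r\n\r\n'
```

The line breaks after "foo foo foo " survive because they come before the second marker, not after it; only the `\r\n` that follows the last marker is removed.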

CodePudding user response:

You can try this regex search:

import regex as re

x = r"[cid:12d32323232dde]foo foo foo \r\n\r\n\r\n[cid:123fsr3ef234fsdfere]\r\n"

re.search(r"\]([^]]+)\[", x)[1]
  1. First, import the regex module (the standard-library re module behaves the same way for this pattern).

  2. Make the string a raw string:

    x = r"" -> the r before the string

    -> with a raw string, the result is
    foo foo foo \r\n\r\n\r\n

    -> without a raw string, the result is
    foo foo foo 
    ...
    
  3. Run a regex search to get the text between ] and [. re.search returns a match object (or None if nothing matches); indexing it selects a group.

    re.search(r"\]([^]]+)\[", x)[0]
    -> group 0: the whole match, including the ] and [

    re.search(r"\]([^]]+)\[", x)[1]
    -> group 1: the captured text, without the ] and [
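
The difference between the raw and the regular string, and between group 0 and group 1, can be checked with a short script. This is a sketch using the standard-library `re` module rather than `regex`; the variable names `raw`, `cooked`, `m_raw`, and `m_cooked` are only illustrative.

```python
import re

# Text between the first "]" and the next "[" is captured as group 1.
pattern = re.compile(r"\]([^]]+)\[")

# Raw string: \r and \n stay as literal backslash + letter (two characters each).
raw = r"[cid:12d32323232dde]foo foo foo \r\n\r\n\r\n[cid:123fsr3ef234fsdfere]\r\n"
# Regular string: \r and \n become real carriage-return / line-feed characters.
cooked = "[cid:12d32323232dde]foo foo foo \r\n\r\n\r\n[cid:123fsr3ef234fsdfere]\r\n"

m_raw = pattern.search(raw)
m_cooked = pattern.search(cooked)

print(m_raw[0])           # whole match, still wrapped in ] and [
print(m_raw[1])           # captured text only, escapes shown as literal text
print(repr(m_cooked[1]))  # captured text with real control characters
print(len(m_raw[1]), len(m_cooked[1]))  # the raw capture is longer
```

Note that `re.search` only returns the first span between a `]` and a `[`, so unlike a substitution it does not handle input with more than two markers, and it returns `None` when no such span exists.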
    