How do I remove the characters between two different characters inside a string?

Time:11-23

I have this line inside a text file:

"00:00:25,58 --> 00:00:27,91 (DRAMATIC MUSIC PLAYING)"

I want to remove the parentheses and everything inside them, so that it becomes:

"00:00:25,58 --> 00:00:27,91 "


import re

# `text` holds the path to the subtitle file
eng_sub = open(text).read()
eng_sub2 = re.sub(r"\(", "", eng_sub)
new_eng_sub = re.sub(r"\)", "", eng_sub2)

open(text, "w").write(new_eng_sub)

I've tried using sub(), but it only removes the parentheses themselves. What I really want is to act on everything between those two characters, "(" and ")".

I don't know how to do that. Thank you for your help.

CodePudding user response:

You can match on the pattern \(.*?\), which matches an opening parenthesis, then as few characters as possible, then the closing parenthesis:

import re

with open(text) as f:
    eng_sub = f.read()
eng_sub2 = re.sub(r'\(.*?\)', '', eng_sub)

with open(text, "w") as f:
    f.write(eng_sub2)
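The difference between the non-greedy \(.*?\) and the greedy \(.*\) only shows up when a line contains more than one parenthesized group. Here is a small sketch of that; the sample line with a second "(GASPS)" cue is made up for illustration and is not from the original file:

```python
import re

# Hypothetical subtitle line with two parenthesized cues
line = "00:00:25,58 --> 00:00:27,91 (DRAMATIC MUSIC PLAYING) Run! (GASPS)"

# Non-greedy: each "(...)" group is matched and removed separately
lazy = re.sub(r"\(.*?\)", "", line)

# Greedy: a single match runs from the first "(" to the last ")"
greedy = re.sub(r"\(.*\)", "", line)

print(repr(lazy))    # the dialogue "Run!" is kept
print(repr(greedy))  # the dialogue "Run!" is removed along with the cues
```

Note that "." does not match a newline by default, so neither pattern removes a group whose parentheses are on different lines.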

CodePudding user response:

Calling sub with an empty replacement simply deletes whatever the pattern matches. If you want to inspect or manipulate the text between the parentheses first, you can use findall (also in the re module), which extracts every match of a pattern from a string. Here's a simple example:

import re
text = "00:00:25,58 --> 00:00:27,91 (DRAMATIC MUSIC PLAYING)"
print(re.findall(r"\(.*?\)", text)[0])  # non-greedy, so each (...) group is matched separately

Output: (DRAMATIC MUSIC PLAYING)

Once you have extracted what you want to manipulate, you can delete the pattern with sub:

print(re.sub(r"\(.*?\)", '', text))
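If the goal is to change the text between the parentheses rather than delete it, re.sub also accepts a function as the replacement, so the extraction and the substitution can happen in one pass. A minimal sketch, assuming you want to lowercase the cue and wrap it in square brackets; the helper name lowercase_inner and that particular transformation are only illustrative:

```python
import re

text = "00:00:25,58 --> 00:00:27,91 (DRAMATIC MUSIC PLAYING)"

def lowercase_inner(match):
    # match.group(1) is the text captured between the parentheses
    return "[" + match.group(1).lower() + "]"

# The capture group (.*?) exposes the inner text to the replacement function
result = re.sub(r"\((.*?)\)", lowercase_inner, text)
print(result)
```

Returning an empty string from the function would give the same result as passing '' as the replacement.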