I have this sample string in the variable test_string, and I want to delete the chorus text from it. The text is laid out in this format: [chorus etc..] chorus_text [verse etc..]. I tried several regex patterns in Python without success. Any ideas? The pattern I am providing doesn't give me what I need. Note that chorus_text includes punctuation and other characters.
test_string = """Cordae,RNP,2019.0,"RNP Lyrics[Chorus: Anderson .Paak] text text text text text text te'all tean' tean' text text text [Verse 1: YBN Cordae]."""
pattern = re.compile(r"[^\[chorus\]$][^\[verse\]]")
subbed_chorus_before_verse = pattern.findall(test_string)
CodePudding user response:
import re
test_string = """Cordae,RNP,2019.0,"RNP Lyrics[Chorus: Anderson .Paak] text text text text text text te'all tean' tean' text text text [Verse 1: YBN Cordae]."""
re.sub(r'(?<=\])(.*)(?=\[)', '', test_string)
# (?<=\]) : positive lookbehind for ]
# (?=\[)  : positive lookahead for [
'Cordae,RNP,2019.0,"RNP Lyrics[Chorus: Anderson .Paak][Verse 1: YBN Cordae].'
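One thing to watch with this pattern: .* is greedy, so it spans from the first ] to the last [ in the string. That is fine for the sample above, which has a single verse tag, but it would also swallow earlier verses if there were several. The sketch below illustrates this on a made-up string (sample is hypothetical, not from the question) and shows one possible lazy variant that keeps the [Chorus ...] tag and drops only the text up to the next tag.

```python
import re

# Hypothetical sample with more than one section after the chorus.
sample = "Title[Chorus: A] la la la [Verse 1: B] first verse [Verse 2: C] second verse"

# Greedy .* runs from the first ] to the last [, so Verse 1 is removed as well.
greedy = re.sub(r'(?<=\])(.*)(?=\[)', '', sample)

# Keep the [Chorus ...] tag (group 1) and lazily drop text up to the next [
# or the end of the string. re.DOTALL lets . cross line breaks in multi-line lyrics.
chorus_only = re.sub(r'(\[Chorus[^\]]*\]).*?(?=\[|$)', r'\1', sample, flags=re.DOTALL)

print(greedy)
print(chorus_only)
```

The first print shows Verse 1 missing; the second leaves both verses intact.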
CodePudding user response:
You can use
re.sub(r'\[Chorus:[^][]*][^[]*', '', test_string)
Details:
\[Chorus: - a [Chorus: string
[^][]* - zero or more chars other than ] and [
] - a ] char
[^[]* - zero or more chars other than [.
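For reuse across many lyrics strings, the same idea can be wrapped in a small helper. This is only a sketch under a few assumptions that go beyond the question: the names CHORUS_RE and strip_chorus are made up, the tag is matched case-insensitively, the colon after Chorus is treated as optional (so a bare [Chorus] also matches), and the brackets inside the character classes are escaped purely for readability.

```python
import re

# A [Chorus...] tag plus everything after it up to the next "[" (or end of string).
# Assumptions: case-insensitive tag, colon optional.
CHORUS_RE = re.compile(r'\[Chorus[^\]\[]*\][^\[]*', re.IGNORECASE)

def strip_chorus(lyrics: str) -> str:
    """Remove every [Chorus ...] tag together with the text that follows it."""
    return CHORUS_RE.sub('', lyrics)

test_string = """Cordae,RNP,2019.0,"RNP Lyrics[Chorus: Anderson .Paak] text text text text text text te'all tean' tean' text text text [Verse 1: YBN Cordae]."""
print(strip_chorus(test_string))
# prints: Cordae,RNP,2019.0,"RNP Lyrics[Verse 1: YBN Cordae].
```

Because [^\[]* is a negated character class, it also matches newlines, so chorus text spread over several lines is removed without needing re.DOTALL.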