How to extract and delete text between two titles: Chorus and Verse Using regex

Time:10-23

I have this sample string in the variable test_string, and I want to delete the chorus text from it. The text is laid out in this format: [chorus etc..] chorus_text [verse etc..]. I tried some regex patterns in Python, but none of them worked. Any ideas? The pattern below doesn't give me what I need. Note that chorus_text can include punctuation and other special characters.

test_string = """Cordae,RNP,2019.0,"RNP Lyrics[Chorus: Anderson .Paak] text text text text text text te'all tean' tean' text text text [Verse 1: YBN Cordae]."""

pattern = re.compile(r"[^\[chorus\]$][^\[verse\]]")
subbed_chorus_before_verse = pattern.findall(test_string)

CodePudding user response:

test_string = """Cordae,RNP,2019.0,"RNP Lyrics[Chorus: Anderson .Paak] text text text text text text te'all tean' tean' text text text [Verse 1: YBN Cordae]."""

import re

re.sub(r'(?<=\])(.*)(?=\[)', '', test_string)

# (?<=\]) : positive lookbehind for ]
# (?=\[)  : positive lookahead for [
# Output (the [Chorus: ...] header itself is kept):
# 'Cordae,RNP,2019.0,"RNP Lyrics[Chorus: Anderson .Paak][Verse 1: YBN Cordae].'
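
One caveat with the lookaround approach: `.*` is greedy, so it removes everything from the first `]` to the last `[`. That is fine for the sample string, which has a single chorus followed by a single verse, but it would swallow whole sections on longer lyrics. The sketch below illustrates this on a made-up string (`multi` is a hypothetical input, not from the question) and shows one possible way to scope the match to the chorus only while keeping its header.

```python
import re

# Hypothetical input with two sections after the chorus (not from the question).
multi = "[Chorus: A] la la [Verse 1: B] words [Verse 2: C] more"

# Greedy lookaround: removes everything between the first ] and the last [.
greedy_multi = re.sub(r'(?<=\]).*(?=\[)', '', multi)

# Scoped variant: capture the chorus header, then drop only the text up to
# the next opening bracket. The header is put back via the \1 backreference.
scoped = re.compile(r'(\[Chorus:[^\]]*\])[^\[]*')
scoped_multi = scoped.sub(r'\1', multi)

print(greedy_multi)  # [Chorus: A][Verse 2: C] more
print(scoped_multi)  # [Chorus: A][Verse 1: B] words [Verse 2: C] more
```

On the sample string from the question both patterns should give the same result, since there is only one bracketed section after the chorus.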

CodePudding user response:

You can use

re.sub(r'\[Chorus:[^][]*][^[]*', '', test_string)


Details:

  • \[Chorus: - a [Chorus: string
  • [^][]* - zero or more chars other than ] and [
  • ] - a ] char
  • [^[]* - zero or more chars other than [.
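
Since the title asks about extracting as well as deleting, here is a minimal sketch that does both with the pattern described above. The capturing group around `[^[]*` and the `re.IGNORECASE` flag are additions for illustration (the question writes "chorus" in lowercase, so case-insensitive matching is assumed to be acceptable); the variable names are hypothetical.

```python
import re

test_string = """Cordae,RNP,2019.0,"RNP Lyrics[Chorus: Anderson .Paak] text text text text text text te'all tean' tean' text text text [Verse 1: YBN Cordae]."""

# Same pattern as above, with a capturing group around the chorus body.
# re.IGNORECASE is an assumption so that "[chorus: ...]" also matches.
chorus_re = re.compile(r'\[Chorus:[^][]*]([^[]*)', re.IGNORECASE)

# Extract: group(1) holds the text between the chorus header and the next "[".
match = chorus_re.search(test_string)
chorus_text = match.group(1).strip() if match else ""

# Delete: remove the chorus header together with its text.
cleaned = chorus_re.sub('', test_string)

print(chorus_text)  # text text text text text text te'all tean' tean' text text text
print(cleaned)      # Cordae,RNP,2019.0,"RNP Lyrics[Verse 1: YBN Cordae].
```

Because `[^[]*` stops at the next opening bracket, the match cannot run past the following section header, whatever punctuation the chorus text contains (as long as it has no `[` of its own).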