I have a dataframe that consists of sentences. I want to delete a specific part of each sentence, starting from a specific match:
df['data']=["First: This is the sentence good mode:one line","Second: This sentence is also good mode:one line","Third: this sentence is too long mode:two lines"]
I would like to remove everything from "mode" to the end of the sentence, including "mode" itself. Expected result:
df['data']=["First: This is the sentence good","Second: This sentence is also good","Third: this sentence is too long"]
This is what I tried
unwanted_list=["mode: one line"]
df['data'].str.replace(unwanted_list, '', regex=True)
This removes "one line", but "mode" is still there. I would like to remove the whole "mode:one line" part.
CodePudding user response:
IIUC, use str.replace with the \s*\bmode:.* regex:
df['data'] = df['data'].str.replace(r'\s*\bmode:.*', '', regex=True)
output:
data
0 First: This is the sentence good
1 Second: This sentence is also good
2 Third: this sentence is too long
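To see what each part of that pattern does, here is a minimal sketch that uses the standard-library re module on the sample strings from the question. pandas' str.replace with regex=True performs the same kind of substitution on each element. The helper name strip_mode is made up for illustration.

```python
import re

# Sample strings taken from the question.
data = [
    "First: This is the sentence good mode:one line",
    "Second: This sentence is also good mode:one line",
    "Third: this sentence is too long mode:two lines",
]

# \s*    any whitespace before "mode:" (so no trailing space is left behind)
# \b     word boundary, so "mode:" is not matched inside a longer word
# mode:  the literal marker
# .*     everything after the marker up to the end of the line
PATTERN = re.compile(r"\s*\bmode:.*")


def strip_mode(text):  # hypothetical helper name, not part of pandas
    """Remove 'mode:' and everything after it."""
    return PATTERN.sub("", text)


cleaned = [strip_mode(s) for s in data]
for s in cleaned:
    print(s)
```

Strings that do not contain the marker pass through unchanged, because the substitution only happens where the pattern matches.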
CodePudding user response:
Try:
df['data'] = df['data'].str.extract(r'(.*?)\s*mode:', expand=False)
Prints:
data
0 First: This is the sentence good
1 Second: This sentence is also good
2 Third: this sentence is too long
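The two approaches differ on rows that do not contain "mode:" at all. The sketch below compares them, assuming a pattern of the form (.*?)\s*mode: for the extract variant. The second row is a hypothetical addition that is not in the question's data.

```python
import pandas as pd

df = pd.DataFrame({"data": [
    "First: This is the sentence good mode:one line",
    "No marker in this sentence",  # hypothetical extra row, not in the question
]})

# str.replace deletes the matched part and leaves non-matching rows untouched.
replaced = df["data"].str.replace(r"\s*\bmode:.*", "", regex=True)

# str.extract keeps only the captured group; expand=False returns a Series.
# Rows where the pattern does not match become NaN.
extracted = df["data"].str.extract(r"(.*?)\s*mode:", expand=False)

print(replaced.tolist())
print(extracted.tolist())
```

If some rows may lack the marker, str.replace is the safer choice, since str.extract turns those rows into missing values.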