I have a data frame like this:
details
Pinene 0.16%, Borneol 0.08%, Myrcene 0.12%,Total terpenes content 1.00%, Parents Strains Kandy KushCookie Monster
Pinene 0.18%, Borneol 0.08%, Myrcene 0.2%,Total terpenes content 05.00%, Parents Strains Kandy KushCookie Monster
I want to remove everything after the Total terpenes content value, so my expected data frame will look like this:
details
Pinene 0.16%, Borneol 0.08%, Myrcene 0.12%,Total terpenes content 1.00%
Pinene 0.18%, Borneol 0.08%, Myrcene 0.2%,Total terpenes content 05.00%
CodePudding user response:
Here is one way to do it:
# use a regex to extract everything up to and including 'Total terpenes content'
# and its value, stopping just before the last comma that follows (positive lookahead),
# then assign the result back to the details column
df['details'] = df['details'].str.extract(r'(.*Total terpenes content.*(?=,))', expand=False)
df['details']
0 Pinene 0.16%, Borneol 0.08%, Myrcene 0.12%,Tot...
1 Pinene 0.18%, Borneol 0.08%, Myrcene 0.2%,Tota...
Name: details, dtype: object
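One caveat with the pattern above: the greedy `.*` before the lookahead runs to the last comma in the string, so it only works while nothing after the percentage contains a comma. A possible alternative, sketched here with `str.replace` rather than `str.extract`, anchors on the percentage that follows 'Total terpenes content' and drops the rest. The third row is a hypothetical example (not from the question) added only to show a comma in the trailing text, and the sketch assumes the value always looks like digits and dots followed by `%`.

```python
import pandas as pd

df = pd.DataFrame({
    "details": [
        "Pinene 0.16%, Borneol 0.08%, Myrcene 0.12%,Total terpenes content 1.00%, Parents Strains Kandy KushCookie Monster",
        "Pinene 0.18%, Borneol 0.08%, Myrcene 0.2%,Total terpenes content 05.00%, Parents Strains Kandy KushCookie Monster",
        # hypothetical row: the trailing text itself contains a comma
        "Pinene 0.10%,Total terpenes content 2.00%, Parents Strains A, B",
    ]
})

# Capture 'Total terpenes content' plus its percentage (assumed format: digits/dots then %),
# match everything after it, and replace the whole match with just the captured part.
pattern = r"(Total terpenes content\s*[\d.]+%).*"
df["details"] = df["details"].str.replace(pattern, r"\1", regex=True)

print(df["details"].tolist())
```

Rows that do not contain 'Total terpenes content' are left unchanged by `str.replace`, whereas `str.extract` would turn them into NaN.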