Remove repeating words in column, based on another column

I have a pandas DataFrame like the one below:

First Column	Second Column
Dog	Dog is good
Big Cat	Big cat is here
Fat rat	Fat rat is there
Pink tree	Pink tree means love

I want to remove the words in the second column that repeat the first column. My desired output is:

First Column	Second Column
Dog	is good
Big Cat	is here
Fat rat	is there
Pink tree	means love

How can I achieve this?

I have looked around here, but could not find a solution that suits my case.

Thanks!

CodePudding user response：

Try using row-wise apply with axis=1:

df['Second Column'] = df.apply(lambda x: x['Second Column'].lower().replace(x['First Column'].lower(), '').strip(), axis=1)

>>> df
  First Column Second Column
0          Dog       is good
1      Big Cat       is here
2      Fat rat      is there
3    Pink tree    means love
>>>
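
Note that the one-liner above lowercases whatever remains of "Second Column". That is harmless for this sample, but it would alter mixed-case text. Below is a minimal sketch of a case-preserving variant, assuming the repeated words should be matched case-insensitively and only at the start of the string. The helper name strip_prefix is hypothetical, and the sample frame is rebuilt from the question so the snippet runs on its own:

```python
import re

import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "First Column": ["Dog", "Big Cat", "Fat rat", "Pink tree"],
    "Second Column": ["Dog is good", "Big cat is here",
                      "Fat rat is there", "Pink tree means love"],
})


def strip_prefix(row):
    # Hypothetical helper: drop a leading, case-insensitive copy of
    # "First Column" from "Second Column" and leave the rest untouched.
    pattern = r"^\s*" + re.escape(row["First Column"]) + r"\s*"
    return re.sub(pattern, "", row["Second Column"], flags=re.IGNORECASE)


result = df.apply(strip_prefix, axis=1)
print(result.tolist())
```

re.escape is used so that any regex metacharacters in "First Column" are treated as literal text.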

CodePudding user response：

Rather than spoon-feeding you a solution (there are several), I'll describe how I'd solve this in the simplest way I can think of. Provided the same pattern repeats throughout the entire dataset and there are no anomalies (like stray whitespace), one solution, IMO, is to slice "Second Column" starting at an offset equal to the length of the element in "First Column" + 1 (to account for the whitespace in "Second Column") and running to the end of the string.

A caveat: this might not be the most "pandas"-esque solution out there.
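
For reference, the slicing idea described above could be sketched roughly as follows. It assumes every "Second Column" value starts with the matching "First Column" value followed by exactly one space. The variable names offsets and sliced are illustrative, and the sample frame is rebuilt from the question:

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "First Column": ["Dog", "Big Cat", "Fat rat", "Pink tree"],
    "Second Column": ["Dog is good", "Big cat is here",
                      "Fat rat is there", "Pink tree means love"],
})

# Offset = length of the "First Column" value + 1 for the separating space.
offsets = df["First Column"].str.len() + 1

# Slice each "Second Column" string from its own offset to the end.
sliced = [text[start:] for text, start in zip(df["Second Column"], offsets)]
print(sliced)
```

Because this only counts characters, it ignores case differences such as "Big Cat" versus "Big cat", but it will silently cut the wrong text if a row does not follow the pattern.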