Using Pandas to dynamically replace values found in other columns-CodePudding

I have a dataset looks like this:

Car	Make	Engine
Toyota Rav 4 8cyl6L	Toyota	8cyl6L
Mitsubishi Eclipse 2.1T	Mitsubishi	2.1T
Monster Gravedigger 25Lsc	Monster	25Lsc

The data was clearly concatenated from Make Model Engine at some point but the car Model was not provided to me.

I've been trying to use Pandas to say that if we take Car, replace instances of Make with a nothing, replace instances of Engine with nothing, then trim the spaces around the result, we will have Model.

Car	Make	Model	Engine
Toyota Rav 4 8cyl6L	Toyota	Rav 4	8cyl6L
Mitsubishi Eclipse 2.1T	Mitsubishi	Eclipse	2.1T
Monster Gravedigger 25Lsc	Monster	Gravedigger	25Lsc

There's something I'm doing wrong when I'm trying to reference another column in this manner.

df['Model'] = df['Car'].str.replace(df['Make'],'')

gives me an error of "unhashable type: 'Series'". I'm guessing I'm accidentally inputting the entire 'Make' column.

At every row I want to make a different substitution using data from other columns in that row. How would I accomplish this?

CodePudding user response：

you can use:

df['Model']=df.apply(lambda x: x['Car'].replace(x['Make'],"").replace(x['Engine'],""),axis=1)
print(df)
'''
                         Car        Make          Model  Engine
0        Toyota Rav 4 8cyl6L      Toyota         Rav 4   8cyl6L
1    Mitsubishi Eclipse 2.1T  Mitsubishi       Eclipse     2.1T
2  Monster Gravedigger 25Lsc     Monster   Gravedigger    25Lsc
'''

CodePudding user response：

A regex proposition using re.sub :

import re

df['Model'] = [re.sub(f'{b}|{c}', '', a) for a,b,c in zip(df['Car'], df['Make'], df["Engine"])]

# Output :

print(df)

                         Car        Make          Model  Engine
0        Toyota Rav 4 8cyl6L      Toyota         Rav 4   8cyl6L
1    Mitsubishi Eclipse 2.1T  Mitsubishi       Eclipse     2.1T
2  Monster Gravedigger 25Lsc     Monster   Gravedigger    25Lsc