Home > database >  How to remove rows using another dataframe
How to remove rows using another dataframe

Time:10-19

I have a dataframe where I have my complete dataset, it has 4 columns, but I need to remove all rows that have the customer_id from another dataframe

Here's an example:

df1 = pd.DataFrame({'client_id' : ['123456', '451245', '563234', '753214', '412456', '442312', '123411', '987654'],
                'name_fran' : [    'will',     'bruce',     'john',     'jorge',     'well',     'dani',     'rafa',     'manoel'],
                'id_fran' : ['331'  ,   '221',   '312',   '531',   '678',   '214',   '456',   '886']})

df2 = pd.DataFrame({'client_id' : ['123456', '987654']})

df1 :

index client_id name_fran id_fran
0 123456 will 331
1 451245 bruce 221
2 563234 john 312
3 753214 jorge 531
4 412456 well 678
5 442312 dani 214
6 123411 rafa 456
7 987654 manoel 886

df2:

index client_id
0 123456
1 987654

result I need is this:

index client_id name_fran id_fran
0 451245 bruce 221
1 563234 john 312
2 753214 jorge 531
3 412456 well 678
4 442312 dani 214
5 123411 rafa 456

I've tried a few ways but got no results, what's the best way to do this?

CodePudding user response:

We could do isin

df1 = df1.loc[~df1['client_id'].isin(df2['client_id'])].reset_index(drop=True)
df1
Out[278]: 
  client_id name_fran id_fran
0    451245     bruce     221
1    563234      john     312
2    753214     jorge     531
3    412456      well     678
4    442312      dani     214
5    123411      rafa     456
  • Related