Home > Blockchain >  How to remove all values in a dataframe based on the values in another dataframe?
How to remove all values in a dataframe based on the values in another dataframe?

Time:10-09

I have two dataframes.

data = {
    'Title': ['Ak1', 'Ak2', 'Ak3', 'Ak4', 'Ak5', 'Ak6', 'Ak7', 'Ak8'],
    'Items': ['A', 'B', 'J', 'A', 'A', 'K', 'L', 'M'],
    'Item2': ['K', 'B', 'O', 'A', 'A', 'K', 'J', 'F'],
    'Item3': ['A', 'K', 'D', 'A', 'A', 'K', 'L', 'M'],
}
df = pd.DataFrame(data)

df

    Title   Items   Item2   Item3
0   Ak1     A       K       A
1   Ak2     B       B       K
2   Ak3     J       O       D
3   Ak4     A       A       A
4   Ak5     A       A       A
5   Ak6     K       K       K
6   Ak7     L       J       L
7   Ak8     M       F       M

second dataframe df2,

data = {
    'Remove': ['A', 'J', 'M']
}
df2 = pd.DataFrame(data)

df2

    Remove
0   A
1   J
2   M

I want to remove all the values in df1 which are thers in df2. Expected output is as follows.

    Title   Items   Item2   Item3
0   Ak1             K   
1   Ak2     B       B       K
2   Ak3             O       D
3   Ak4         
4   Ak5         
5   Ak6     K       K       K
6   Ak7     L               L
7   Ak8             F           

CodePudding user response:

You can use isin function:

df3 = df.mask(df.isin(df2['Remove'].values), '')
df3

    Title   Items   Item2   Item3
0   Ak1             K   
1   Ak2     B       B       K
2   Ak3             O       D
3   Ak4         
4   Ak5         
5   Ak6     K       K       K
6   Ak7     L               L
7   Ak8             F           

CodePudding user response:

How about using pandas.replace? If I try to implemented on your sample data, the result is followed:

df.replace(df2.Remove.unique(),'')

    Title   Items   Item2   Item3
0   Ak1               K 
1   Ak2       B       B       K
2   Ak3               O       D
3   Ak4         
4   Ak5         
5   Ak6       K       K       K
6   Ak7       L               L
7   Ak8               F 
  • Related