Remove the values in pandas column list that match another column in that same row

Time:07-26

I have a pandas DataFrame in which one column holds lists. For each row, I want to remove from that list any value that matches another column in the same row. Note that 'similar_ids' is sometimes empty or has only one value. An example is below:

original

ID     similar_ids 
1       1, 234, 3215
2       2, 52, 1
3       49, 3
4       4
5

desired

ID     similar_ids 
1       234, 3215
2       52, 1
3       49
4       
5

CodePudding user response:

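import pandas as pd

d = {'ID': [1, 2, 3, 4, 5], 'similar_ids': [[1, 234, 3215], [2, 52, 1], [49, 3], [4], []]}
df = pd.DataFrame(data=d)

# Walk both columns in parallel and edit each row's list in place.
# This avoids label-based lookups like df['ID'][i], which assume a default 0..n-1 index.
# Note: list.remove() drops only the first occurrence of the ID.
for row_id, ids in zip(df['ID'], df['similar_ids']):
    if row_id in ids:
        ids.remove(row_id)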

CodePudding user response:

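# Uses the same df as in the answer above; builds a new list per row
# and drops every occurrence of that row's ID.
df['similar_ids'] = df.apply(lambda row: [x for x in row.similar_ids if x != row.ID], axis=1)
print(df)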

Output:

   ID  similar_ids
0   1  [234, 3215]
1   2      [52, 1]
2   3         [49]
3   4           []
4   5           []
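
The question mentions that 'similar_ids' is sometimes empty. If "empty" can also mean a missing cell (None/NaN) rather than an empty list, both answers above would raise an error on that row. Below is a minimal sketch of one way to guard against that, assuming missing cells should become empty lists. The helper name drop_own_id is hypothetical, and the None in row 5 is an assumed stand-in for a missing value, not part of the original data.

```python
import pandas as pd

def drop_own_id(ids, own_id):
    """Return ids without own_id; treat any non-list cell (None/NaN) as an empty list."""
    if not isinstance(ids, list):
        return []
    return [x for x in ids if x != own_id]

# Same data as the question, except the last row is missing (assumption) instead of [].
df = pd.DataFrame({
    'ID': [1, 2, 3, 4, 5],
    'similar_ids': [[1, 234, 3215], [2, 52, 1], [49, 3], [4], None],
})

# Pair each row's list with its ID and build the cleaned column in one pass.
df['similar_ids'] = [
    drop_own_id(ids, own_id)
    for ids, own_id in zip(df['similar_ids'], df['ID'])
]
print(df)
```

This should print the same frame as the output above. A plain list comprehension over zip also tends to be faster than apply with axis=1 on larger frames, since it does not build a Series object for every row.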