Pandas Drop duplicates, reverse of subset

Time:03-12

I want to drop duplicates from my DataFrame. I know I can use subset to list the columns to compare, but I have 50 columns. Is there a way to include all columns and exclude only a few?

For example, include columns B, C, D, E, G, H, I, etc., and exclude A and F.

Something like: df.drop_duplicates(subset_to_exclude=['A', 'F'])

Thanks.

CodePudding user response:

One approach is to build the subset with a list comprehension that keeps every column except the ones you want to exclude:

import pandas as pd

df = pd.DataFrame({
    'A': ['Yum Yum', 'Yum Yum', 'Indomie', 'Indomie', 'Indomie'],
    'B': ['cup', 'cup', 'cup', 'pack', 'pack'],
    'F': [4, 4, 3.5, 15, 5]
})

# Compare rows on every column except A and F
print(df.drop_duplicates(subset=[val for val in df.columns if val not in ("A", "F")]))

         A     B     F
0  Yum Yum   cup   4.0
3  Indomie  pack  15.0

In this example only B is left once A and F are excluded, so the call above is equivalent to:

print(df.drop_duplicates(subset=["B"]))

         A     B     F
0  Yum Yum   cup   4.0
3  Indomie  pack  15.0
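
If the list comprehension feels heavy with 50 columns, here are two other ways to express "everything except A and F". This is only a sketch built on the same sample data, and the names exclude, result_subset and result_mask are made up for illustration. The first builds the subset with Index.difference. The second drops the excluded columns, calls duplicated() on what is left, and filters the original frame with the resulting mask.

```python
import pandas as pd

df = pd.DataFrame({
    "A": ["Yum Yum", "Yum Yum", "Indomie", "Indomie", "Indomie"],
    "B": ["cup", "cup", "cup", "pack", "pack"],
    "F": [4, 4, 3.5, 15, 5],
})

exclude = ["A", "F"]  # illustrative name: columns to ignore when comparing rows

# Option 1: build the subset from every column label not in `exclude`.
# Index.difference returns the remaining labels in sorted order; the order of
# `subset` does not change which rows count as duplicates.
subset = df.columns.difference(exclude).tolist()
result_subset = df.drop_duplicates(subset=subset)

# Option 2: flag duplicates on a view without the excluded columns,
# then filter the original frame with the boolean mask.
mask = df.drop(columns=exclude).duplicated()
result_mask = df[~mask]

print(result_subset)
print(result_mask)
```

Both options keep the first occurrence of each duplicate group, which is the default for drop_duplicates and duplicated, so here they return the same rows as the list comprehension version.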