Remove duplicate records in Excel sheet using python

Time:11-16

I used the code below to remove duplicates based on the "Name" and "Discription" columns of an Excel sheet, but it didn't work. Can anyone explain why, or suggest another way to remove the duplicates?

df = pd.read_excel("file_path")
df.drop_duplicates(subset=["Name","Discription"], keep='first')
df.to_excel("New.xlsx",index=False)

It writes New.xlsx with the same content as the original file, duplicates included.

CodePudding user response:

By default, drop_duplicates returns a new dataframe and leaves the original unchanged, so you need to reassign your dataframe before saving the new spreadsheet:

df = df.drop_duplicates(subset=["Name","Discription"], keep="first")

Or use inplace=True:

df.drop_duplicates(subset=["Name","Discription"], keep="first", inplace=True)
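To see the difference between the two forms, here is a minimal sketch on a small in-memory dataframe. The sample rows are made up for illustration and stand in for the contents of the Excel sheet; only the column names come from the question.

```python
import pandas as pd

# Hypothetical sample data standing in for the Excel sheet.
df = pd.DataFrame({
    "Name": ["a", "a", "b"],
    "Discription": ["x", "x", "y"],
})

# Without reassignment the deduplicated result is discarded; df is unchanged.
df.drop_duplicates(subset=["Name", "Discription"], keep="first")
rows_without_reassign = len(df)

# Reassigning (here to a new name) keeps the deduplicated frame.
deduped = df.drop_duplicates(subset=["Name", "Discription"], keep="first")
rows_after_reassign = len(deduped)

# With inplace=True the frame itself is modified and the call returns None.
df_copy = df.copy()
result = df_copy.drop_duplicates(
    subset=["Name", "Discription"], keep="first", inplace=True
)

print(rows_without_reassign, rows_after_reassign, result, len(df_copy))
```

Because the inplace=True call returns None, its result should not be assigned back to df.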

Or as a single chained expression:

(
    pd.read_excel("file_path")
        .drop_duplicates(subset=["Name","Discription"], keep="first")
        .to_excel("New.xlsx", index=False)
)
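Since the question also asks for another method, one option is to build a boolean mask with duplicated() and filter on it. This is only a sketch: the sample rows and the "Value" column are hypothetical, and the mask also lets you inspect which rows would be dropped before writing the file.

```python
import pandas as pd

# Hypothetical sample data; "Value" is an extra illustrative column.
df = pd.DataFrame({
    "Name": ["a", "a", "b"],
    "Discription": ["x", "x", "y"],
    "Value": [1, 2, 3],
})

# True for every row that repeats an earlier (Name, Discription) pair.
mask = df.duplicated(subset=["Name", "Discription"], keep="first")

dropped_rows = df[mask]   # rows that would be removed
unique_rows = df[~mask]   # rows that are kept

print(dropped_rows)
print(unique_rows)
```

From there, unique_rows.to_excel("New.xlsx", index=False) would write the deduplicated sheet, as in the answer above.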