How to move ALL duplicated rows into separate dataframe

Time:03-12

My code removes all duplicates using drop_duplicates with keep=False.

The issue I'm having is that before I remove the duplicates, I want to move all of the removed rows to a separate dataframe. I've come up with the line of code below; however, I think it's leaving one duplicate behind and not moving ALL duplicates.

duplicates_df = combined_df.loc[combined_df.duplicated(subset='Unique_ID_Count'), :]

combined_df.drop_duplicates(subset='Unique_ID_Count', inplace=True, keep=False)

Do you have any ideas on how I can move all duplicates dropped in the second line of code to the duplicates_df dataframe?

Any help would be much appreciated, thanks!

CodePudding user response:

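By default, duplicated() uses keep='first', which marks every occurrence except the first, so one row from each group of duplicates never reaches duplicates_df. Pass keep=False to mark all of them, and use the inverted mask for the rows you want to keep: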

duplicates_df = combined_df.loc[combined_df.duplicated(subset='Unique_ID_Count', keep=False)]
combined_df   = combined_df.loc[~combined_df.duplicated(subset='Unique_ID_Count', keep=False)]
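To see the difference between the two masks, here is a minimal sketch on made-up data. Only the column name Unique_ID_Count comes from the question; the sample values, the extra "value" column, and the names partial_mask, dup_mask, and unique_df are assumptions for illustration.

```python
import pandas as pd

# Hypothetical sample data; only the column name comes from the question.
combined_df = pd.DataFrame({
    "Unique_ID_Count": [1, 2, 2, 3, 3, 3, 4],
    "value": list("abcdefg"),
})

# Default keep='first': the first occurrence in each group is NOT marked.
partial_mask = combined_df.duplicated(subset="Unique_ID_Count")

# keep=False: every row whose key appears more than once is marked.
dup_mask = combined_df.duplicated(subset="Unique_ID_Count", keep=False)

duplicates_df = combined_df.loc[dup_mask]   # all rows with a repeated key
unique_df = combined_df.loc[~dup_mask]      # rows whose key appears once

print(partial_mask.sum())  # 3
print(len(duplicates_df))  # 5
print(len(unique_df))      # 2
```

Computing the mask once and reusing it also avoids calling duplicated() twice, and the two resulting dataframes together contain every original row exactly once.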