Pandas: how to create a new dataframe that only has duplicate ids

Time:07-18

I am trying to create a new dataframe that has the columns id and name, for all the duplicate ids in the dataframe.

My dataframe's structure is:

id, name, lat, lon, price, minimum_nights, review_cnt

I tried the .duplicated() method, but I am not getting what I need. I think I might be using it wrong.

CodePudding user response:

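.duplicated() returns a boolean mask and, by default (keep='first'), marks every duplicated row except its first occurrence. To get all duplicated rows for 'id' and 'name', including the first occurrence, pass keep=False. Note that this treats a row as a duplicate only when both 'id' and 'name' match; to compare on 'id' alone, pass subset='id'.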

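# Keep only the two columns of interest (this overwrites df)
df = df[['id', 'name']].copy()
# keep=False marks every row of a duplicated group, including the first
duplicates = df[df.duplicated(keep=False)]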
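Since the question asks for duplicate ids specifically, here is a minimal sketch that compares on 'id' alone and contrasts it with the pair-based check above. The sample rows and the names mask, dup_ids, cols and pair_dups are made up for illustration; only the column names come from the question.

```python
import pandas as pd

# Hypothetical sample data; only the column names come from the question.
df = pd.DataFrame({
    "id": [1, 2, 2, 3, 3, 4],
    "name": ["a", "b", "b2", "c", "c", "d"],
    "price": [10, 20, 25, 30, 30, 40],
})

# Mark every row whose id occurs more than once.
# keep=False includes the first occurrence of each duplicated id.
mask = df.duplicated(subset="id", keep=False)

# New dataframe with only the id and name columns for those rows.
dup_ids = df.loc[mask, ["id", "name"]]
print(dup_ids)

# For contrast: without subset, a row counts as a duplicate only when
# both id and name match, so the rows with id 2 are not selected here.
cols = df[["id", "name"]]
pair_dups = cols[cols.duplicated(keep=False)]
print(pair_dups)
```

With subset='id', rows that share an id but differ in name are still selected, which may be closer to what the question describes. The original df is left unchanged because the result is assigned to a new name.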