Pandas - Creating a pivot when the index column contains duplicates

Time:10-08

I am trying to pivot a DataFrame that has no numerical values and has duplicate values in the index column. My data looks like this:

sale_id, product, sale_date
101, ABC, 2021-01-01
101, DEF, 2021-02-01
101, XYZ, 2021-03-01
101, KLM, 2021-01-04

Expected output:

    ABC, DEF, XYZ, KLM
101 2021-01-01, 2021-02-01, 2021-03-01, 2021-01-04

I tried the following:

df.pivot(index='sale_id', columns='product', values='sale_date')

It raised the following error:

ValueError: Index contains duplicate entries, cannot reshape

Answer:

To check for duplicated sale_id/product pairs, use DataFrame.duplicated:

df1 = df[df.duplicated(['sale_id','product'], keep=False)]
print(df1)
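
As a minimal sketch of the check above: the sample rows in the question contain no repeated sale_id/product pair, so the real data presumably does. The frame below is hypothetical; it is the question's data plus one extra assumed row that repeats (101, ABC), which is enough to reproduce the error and to see what duplicated reports.

```python
import pandas as pd

# Hypothetical data: the question's four rows plus one assumed extra row
# that repeats the (101, "ABC") pair.
df = pd.DataFrame(
    {
        "sale_id": [101, 101, 101, 101, 101],
        "product": ["ABC", "DEF", "XYZ", "KLM", "ABC"],
        "sale_date": [
            "2021-01-01",
            "2021-02-01",
            "2021-03-01",
            "2021-01-04",
            "2021-05-01",
        ],
    }
)

# pivot() needs each (index, columns) pair to be unique, so this raises.
try:
    df.pivot(index="sale_id", columns="product", values="sale_date")
    pivot_error = None
except ValueError as exc:
    pivot_error = str(exc)
print(pivot_error)

# keep=False flags every row of a duplicated group, not only the repeats.
df1 = df[df.duplicated(["sale_id", "product"], keep=False)]
print(df1)
```

With keep=False both ABC rows are listed, which makes it easier to decide which one should survive.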

To remove the duplicates (by default the first row of each sale_id/product pair is kept), use DataFrame.drop_duplicates before pivoting:

(df.drop_duplicates(['sale_id','product'])
   .pivot(index='sale_id', columns='product', values='sale_date'))
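
A self-contained sketch of the same approach, again on hypothetical data (the question's rows plus one assumed duplicate of the 101/ABC pair). It relies on the default keep='first' of drop_duplicates, so the earlier ABC date is the one that reaches the pivot; pass keep='last' if the later row should win instead.

```python
import pandas as pd

# Hypothetical data: the question's four rows plus one assumed duplicate
# of the (101, "ABC") pair with a later date.
df = pd.DataFrame(
    {
        "sale_id": [101, 101, 101, 101, 101],
        "product": ["ABC", "DEF", "XYZ", "KLM", "ABC"],
        "sale_date": [
            "2021-01-01",
            "2021-02-01",
            "2021-03-01",
            "2021-01-04",
            "2021-05-01",
        ],
    }
)

# Drop repeated (sale_id, product) pairs, keeping the first occurrence,
# then reshape to one row per sale_id and one column per product.
wide = (
    df.drop_duplicates(["sale_id", "product"])
      .pivot(index="sale_id", columns="product", values="sale_date")
)
print(wide)
```

Note that the product columns come out in sorted order (ABC, DEF, KLM, XYZ) rather than in order of appearance.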