Home > Software design >  pandas pivot_tables doesn't work with date data (No numeric types to aggregate)
pandas pivot_tables doesn't work with date data (No numeric types to aggregate)

Time:03-16

I have the following dataframe:

index   id    code    data    date
0      AZ234  B213    apple   2020-09-01 <- duplicate id, code, data
1      AZ234  B213    apple   2022-02-02 <- duplicate id, code, data
2      AZ234  B213    banana  2020-07-01
3      AZ234  B213    orange  2020-05-11
4      AL612  B309    apple   2020-12-05
5      AL612  B309    banana  2020-07-21
6      AL612  B309    orange  2020-09-21

...

I want to create pivot table to get the following table:

id    code    apple         banana        orange
AZ234  B213   2020-09-01    2020-07-01     2020-05-11
AL612  B309   2020-12-05    2020-07-21     2020-09-21
...

I have tried to do this using pivot_table (pandas):

pd.pivot_table(df, values='date', index=['id','code'],
                       columns=['data'])

but I get this error:

DataError: No numeric types to aggregate

I have read this post but it seems to be a bit different as I don't want to change the columns and also I got error when I tried to set_index with code and id ( " ValueError: Index contains duplicate entries, cannot reshape").

My goal is to create pivot table with dates as values of the table.

CodePudding user response:

There are duplicates per id, date, data so is necessary add some aggregate function:

If there are datetimes:

df['date'] = pd.to_datetime(df['date'])

df.pivot_table(values='date', index=['id','code'], columns=['data'], aggfunc='first')

df.pivot_table(values='date', index=['id','code'], columns=['data'], aggfunc='max')

If there are strings:

print (df['date'].dtype)

df.pivot_table(values='date', index=['id','code'], columns=['data'], aggfunc=','.join)
  • Related