What is the idiomatic way to write pandas groupby result to DataFrame?

Time:01-03

The source df looks like this:

EventType  User  Item
View       A     1
View       B     1
Like       C     2
View       C     2
Buy        A     1

We have 5 users: A B C D E

We have 6 items: 1 2 3 4 5 6

I would like to generate a new df like this:

Event_Type Event_Ratio    ItemsHaveEvent  UsersHaveEvent
View           0.6            0.33         0.6
Like           0.2            0.167        0.2
Buy            0.2            0.167        0.2

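Event_Type: the same as EventType in the original df

Event_Ratio: number of rows with this event / total number of events

ItemsHaveEvent: number of distinct items that have this event / total number of items

UsersHaveEvent: number of distinct users that have this event / total number of users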
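
As a sanity check on these definitions, the View row can be worked out by hand. This is only a sketch: the DataFrame is rebuilt from the sample above, and the names `view`, `total_users`, and `total_items` are introduced here for illustration.

```python
import pandas as pd

# Sample data copied from the question
df = pd.DataFrame({
    "EventType": ["View", "View", "Like", "View", "Buy"],
    "User": ["A", "B", "C", "C", "A"],
    "Item": [1, 1, 2, 2, 1],
})

total_users = 5  # users A..E, as stated in the question
total_items = 6  # items 1..6, as stated in the question

# Keep only the rows for one event type
view = df[df["EventType"] == "View"]

event_ratio = len(view) / len(df)                         # 3 of 5 events
items_have_event = view["Item"].nunique() / total_items   # items 1 and 2, so 2 of 6
users_have_event = view["User"].nunique() / total_users   # users A, B, C, so 3 of 5

print(event_ratio, round(items_have_event, 3), users_have_event)
```

These values should agree with the View row of the desired output (0.6, 0.33, 0.6).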

How can I write idiomatic pandas code in a declarative way to do this?

CodePudding user response:

One option is with named aggregation:

# Denominators: item and user totals are given in the question
total_items = 6
total_users = 5
total_events = len(df)

# One output row per event type, in order of first appearance
(df
 .groupby('EventType', sort=False, as_index=False)
 .agg(
     EventRatio=('EventType', lambda f: f.size / total_events),
     ItemsHaveEvent=('Item', lambda f: f.nunique() / total_items),
     UsersHaveEvent=('User', lambda f: f.nunique() / total_users),
 )
)

  EventType  EventRatio  ItemsHaveEvent  UsersHaveEvent
0      View         0.6        0.333333             0.6
1      Like         0.2        0.166667             0.2
2       Buy         0.2        0.166667             0.2
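
A possible variant, sketched here as an assumption rather than as part of the answer above, replaces the lambdas with the built-in `size` and `nunique` aggregations and divides by the totals afterwards. The names `counts`, `totals`, and `result` are illustrative, and the DataFrame is rebuilt from the question's sample so the snippet runs on its own.

```python
import pandas as pd

# Sample data copied from the question
df = pd.DataFrame({
    "EventType": ["View", "View", "Like", "View", "Buy"],
    "User": ["A", "B", "C", "C", "A"],
    "Item": [1, 1, 2, 2, 1],
})

total_items = 6  # given in the question
total_users = 5  # given in the question

# Raw counts per event type; "size" counts rows, "nunique" counts distinct values
counts = (
    df.groupby("EventType", sort=False)
      .agg(
          EventRatio=("User", "size"),
          ItemsHaveEvent=("Item", "nunique"),
          UsersHaveEvent=("User", "nunique"),
      )
)

# One denominator per output column; div aligns the Series index with the column labels
totals = pd.Series({
    "EventRatio": len(df),
    "ItemsHaveEvent": total_items,
    "UsersHaveEvent": total_users,
})

result = counts.div(totals).reset_index()
print(result)
```

Keeping the counting and the division as separate steps avoids Python-level lambdas inside `agg`, which may matter on larger frames; for data of this size either form should be fine.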
