I want to get grouped counts and percentages using pandas

Time:02-06

The table shows the count and percentage of cycle rides, grouped by membership type (casual, member).

What I have done in R:

big_frame %>% 
     group_by(member_casual) %>% 
     summarise(count = length(ride_id),
               '%' = round((length(ride_id) / nrow(big_frame)) * 100, digit=2))

This is the best I've come up with in pandas, but I feel like there should be a better way:

member_casual_count = (
    big_frame
    .filter(['member_casual'])
    .value_counts(normalize=True).mul(100).round(2)
    .reset_index(name='percentage')
)

member_casual_count['count'] = (
    big_frame
    .filter(['member_casual'])
    .value_counts()
    .tolist()  
)
member_casual_count

Thank you in advance

CodePudding user response:

In R, you should be doing something like this:

big_frame %>%
  count(member_casual) %>%
  mutate(perc = n/sum(n))

In Python, you can achieve the same result like this:

(
    big_frame
    .groupby("member_casual")
    .size()
    .to_frame('ct')
    .assign(perc=lambda df: df.ct / df.ct.sum())  # fraction of all rows, like perc in the R version
)
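
If you want the count and a rounded percentage side by side, as in the R output from the question, one option is to compute the counts once and derive the percentage from them. This is only a minimal sketch: the toy `big_frame` below is made-up data standing in for the real one, and `summary` is just an illustrative name.

```python
import pandas as pd

# Toy stand-in for big_frame; these values are invented for illustration.
big_frame = pd.DataFrame({
    "ride_id": ["a", "b", "c", "d", "e"],
    "member_casual": ["member", "casual", "member", "member", "casual"],
})

# Count the rows per membership type once, then reuse the result.
counts = big_frame["member_casual"].value_counts()

summary = (
    pd.DataFrame({
        "count": counts,
        # Share of all rows, expressed as a percentage with 2 decimals.
        "percentage": (counts / counts.sum() * 100).round(2),
    })
    .rename_axis("member_casual")  # name the index so reset_index gives a clean column
    .reset_index()
)

print(summary)
```

Because both columns are built from the same `counts` Series, they stay aligned on the index, so there is no need to call `value_counts()` twice and rely on both results coming back in the same order.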