Home > front end >  Apply pandas groupby, sum and mean based on another column
Apply pandas groupby, sum and mean based on another column

Time:01-05

Im working on a data frame. Trying to group, sum, and mean on based on the other column. As all the 3 operations are in one. Im unable to find any solution. Please help

enter image description here

Expected Output: enter image description here

mydf = pd.DataFrame.from_records([{'Month-Year': 'Apr-2020',
  'meets_target_str': 'Target Met',
  'granularity_date': '2020-04-01',
  'meets_target': 31,
   'total' : 67,
        'percentage':46.26},
 {'Month-Year': 'Apr-2020',
  'meets_target_str': 'Target not met',
  'granularity_date': '2020-04-01',
  'meets_target': 36,
    'total' : 67,
 'percentage':53.73},
 {'Month-Year': 'May-2020',
  'meets_target_str': 'Target not met',
  'granularity_date': '2020-05-01',
  'meets_target': 41,
 'total' : 97,
 'percentage':42.26},
 {'Month-Year': 'May-2020',
  'meets_target_str': 'Target Met',
  'granularity_date': '2020-05-01',
  'meets_target': 56,
 'total' : 97,
 'percentage':57.73},
 {'Month-Year': 'Jun-2020',
  'meets_target_str': 'Target not met',
  'granularity_date': '2020-06-01',
  'meets_target': 41,
 'total' : 70,
 'percentage':58.47},
 {'Month-Year': 'Jun-2020',
  'meets_target_str': 'Target Met',
  'granularity_date': '2020-06-01',
  'meets_target': 29,
    'total' : 70,
  'percentage':41.42
 }])

CodePudding user response:

You can also do it like this with eval to get a one-liner:

mydf.assign(total=mydf.groupby(['Month-Year'])['meets_target'].transform('sum'))\
    .eval('percentage=meets_target/total*100')

Output:

  Month-Year meets_target_str granularity_date  meets_target  total  percentage
0   Apr-2020       Target Met       2020-04-01            31     67   46.268657
1   Apr-2020   Target not met       2020-04-01            36     67   53.731343
2   May-2020   Target not met       2020-05-01            41     97   42.268041
3   May-2020       Target Met       2020-05-01            56     97   57.731959
4   Jun-2020   Target not met       2020-06-01            41     70   58.571429
5   Jun-2020       Target Met       2020-06-01            29     70   41.428571

CodePudding user response:

mydf["total"] = mydf.groupby("granularity_date")["meets_target"].transform("sum")
mydf["percentage"] = mydf["meets_target"] / mydf["total"] * 100
  Month-Year meets_target_str granularity_date  meets_target  total  percentage
0   Apr-2020       Target Met       2020-04-01            31     67   46.268657
1   Apr-2020   Target not met       2020-04-01            36     67   53.731343
2   May-2020   Target not met       2020-05-01            41     97   42.268041
3   May-2020       Target Met       2020-05-01            56     97   57.731959
4   Jun-2020   Target not met       2020-06-01            41     70   58.571429
5   Jun-2020       Target Met       2020-06-01            29     70   41.428571

Optionally round percentage to 2 decimals with .round

  • Related