Home > Blockchain >  Groupby and average - Python
Groupby and average - Python

Time:05-28

Currently have a dataframe which is:

price     type                 randomc1     randomc2   randomc3
2         Dumpling
1         Milk Based Drinks
2         Dumpling
3         Milk Based Drinks
7         Cold Cuts
5         Cold Cuts

Wanted to get the average of prices of the similar types.

wanted output:

type                        average
Dumpling                       2
Milk Based Drinks              2
Cold Cuts                      6

also, there are about 100 different "types". so, ideally would want to get every "type" to be printed.

any help would be greatly appreciated.

edit: output to print(df.to_dict()) 

{'Dish_Type': ['Dumpling',
  'Dumpling',
  'Milk Based Drinks',
  'Milk Based Drinks',
  'Dumpling'],
 'Dish_Price': ['$9.95', '$7.95', '$8.95', '$8.95', '$9.95']}

CodePudding user response:

You can use:

out = (df.assign(Dish_Price=df['Dish_Price'].str.lstrip('$').astype(float))
         .groupby('Dish_Type', as_index=False)
         .agg(Dish_Average=('Dish_Price', 'mean')))
print(out)

# Output
           Dish_Type  Dish_Average
0           Dumpling      9.283333
1  Milk Based Drinks      8.950000

Setup:

data = {'Dish_Type': ['Dumpling', 'Dumpling', 'Milk Based Drinks',
                      'Milk Based Drinks', 'Dumpling'],
        'Dish_Price': ['$9.95', '$7.95', '$8.95', '$8.95', '$9.95']}
df = pd.DataFrame(data)

CodePudding user response:

Unless I understand correctly, it shouldn't matter how many different types you have, all of them will be considered when using groupby(). Have you tried:

df.groupby('type',as_index=False).agg(average=pd.NamedAgg('price','mean'))
  • Related