Pandas: Multiply row value by groupby of another column as a new column

Time:10-07

I have a DataFrame that looks like the one below. I am trying to add a new column df['new_sales'] in which each row's df['rate'] is multiplied by the sum of df['sales'] for that row's ['state', 'store'] group.

import pandas as pd
data = [['california', 'a', 11, 0.6], ['california', 'a', 12, 0.4], ['california', 'b', 32, 0.7]]
df = pd.DataFrame(data, columns=['state', 'store', 'sales', 'rate'])

I was trying something like this but couldn't get it to work.

df['new_sales'] = df.groupby(['state','store'])['sales'].apply(lambda x: x.sum()*df['rate'])

The output would look like this.

[image: expected output table]

CodePudding user response:

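1. With a lambda:

# Multiply each group's total sales by the rates of that group's own rows only
result = df.groupby(['state', 'store'], group_keys=False)['sales'].apply(lambda x: x.sum() * df.loc[x.index, 'rate'])

2. With a named function:

def doCalculation(group):
    # Total sales for this (state, store) group
    sales = group['sales'].sum()
    # Per-row rates within the same group
    rate = group['rate']
    return sales * rate

# group_keys=False keeps the original row index on the result
result = df.groupby(['state', 'store'], group_keys=False).apply(doCalculation)

3. Assign the result back as a new column (it aligns on the original index):

df['NewSales'] = result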

CodePudding user response:

Use transform, which broadcasts each group's sum back to the rows of that group, so the result has the same length and index as the original DataFrame. It should be faster than apply and needs no anonymous function:

df['NewSales'] = df.groupby(['state', 'store']).sales.transform('sum') * df.rate

print(df)

         state store  sales  rate  NewSales
0  california     a     11   0.6      13.8
1  california     a     12   0.4       9.2
2  california     b     32   0.7      22.4
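
To see why this lines up, it may help to split the one-liner into two steps. The following is only a sketch using the sample data from the question. The intermediate column name group_total is made up for illustration and is not part of the original answer.

```python
import pandas as pd

data = [['california', 'a', 11, 0.6],
        ['california', 'a', 12, 0.4],
        ['california', 'b', 32, 0.7]]
df = pd.DataFrame(data, columns=['state', 'store', 'sales', 'rate'])

# Step 1: transform('sum') returns one value per original row:
# the total sales of that row's (state, store) group.
# 'group_total' is a hypothetical helper column, used here for illustration.
df['group_total'] = df.groupby(['state', 'store'])['sales'].transform('sum')

# Step 2: ordinary element-wise multiplication, aligned on the original index.
df['NewSales'] = df['group_total'] * df['rate']

print(df)
```

Because transform keeps the original index, the product can be assigned straight back to the DataFrame without reshaping or going through .values.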
