Home > other >  Using goupby with a condition on another column in pandas
Using goupby with a condition on another column in pandas

Time:12-11

I'm trying to calculate the summation of some elements in my data frame, if another column in my data has a non-zero value. I came up with this:

data = {'Type': ['A', 'A', 'B', 'C', 'C'] ,
    'Vol' : [10, 20, 15, 15, 15] ,
    'Cost' : [500, 300, 200, 400, 400] , 
    'IsSold' : [1, 0, 1, 1, 0]}
totalA = 0
totalB = 0
totalC = 0
totalD = 0
for idx, el in enumerate(df.iloc[:,:]):
if df['Type'][idx] == "A" and df['IsSold'][idx] == 1 :
    totalA = df['Vol'][idx]

It's working but I'm wondering if there is a better way to calculate desired parameters. because the number of different groups in my data (based on Type column) is high and I think it's not the optimum solution. I also tried using groupby function as below:

df.groupby('Type')[['Vol']].sum()

but I don't know how to check the condition in this method.

CodePudding user response:

You can do a filtering before the .groupby:

out = df[df.IsSold == 1].groupby("Type")["Vol"].sum()
print(out)

Prints:

Type
A    10
B    15
C    15
Name: Vol, dtype: int64

CodePudding user response:

You can either groupby Type and IsSold or you filter first the rows where InSold == 1 and do your groupby. In code that could look like as

df.query('IsSold==1').groupby('Type')['Vol'].sum()
  • Related