Home > front end >  panda groupby agg and calculated function together
panda groupby agg and calculated function together

Time:07-11

I have below contents in csv file

key1    key2    Key3    key4    key5

Val1    A        51     True    25
Val1    A        50     False   25
Val1    A        49     True    25
Val1    A        48     True    25
Val2    A        47     False   25
Val2    A        46     True    25
Val2    A        45     False   25
Val2    A        44     True    25
Val2    A        43     True    25

I have to group by key1 and key2 then find the max of key5 and count of total rows of key4 and count of total true rows of key4

What I am trying

print(json_data.groupby(["key1","key2"]).agg(
        max_key5=('key5','max'),
        total_key4=('key4','count'),
        total_true_key4 = ("key4", lambda x: count(x == TRUE))   
    )) 

I also tried like belwo

print(json_data.groupby(["section","row"]).agg(
    max_key5=('key5','max'),
    total_skey4=('key4','count'),
    total_true_key4 = pd.NamedAgg(
         column="key4", 
         aggfunc=lambda x: (x == True))    
     ))

I am not getting total_true_key4 proper value

Any suggestion highly appreciated..

thanks

CodePudding user response:

An easy way to count the number of Trues in any array is to take the sum of that array (since they're typically based on the integer values 0 & 1; True == 1, and False == 0).

out = (
    json_data.groupby(['key1', 'key2'])
    .agg(
        maxkey5=('key5', 'max'), 
        total_key4=('key4', 'count'), 
        total_true_key4=('key4', 'sum')
    )
)

print(out)
           maxkey5  total_key4  total_true_key4
key1 key2                                      
Val1 A          25           4                3
Val2 A          25           5                3

CodePudding user response:

I would probably simply use the sum function:

print(json_data.groupby(["key1","key2"]).agg(
    max_key5=('key5','max'),
    total_key4=('key4','count'),
    total_true_key4 = ("key4", "sum")   
)) 

It counts the True values as 1 and the False values as 0.

  • Related