Home > Mobile >  python pandas dataframe aggregate rows
python pandas dataframe aggregate rows

Time:07-03

I have a dataframe like so.

   id   K  V
0   1  k1  3
1   1  k2  4
2   1  k2  5
3   1  k1  5
4   2  k1  2
5   2  k1  3
6   2  k2  3

And i also have a set of conditions like k1 > 1 and k2 < 4.

I want to process the conditions and create a new dataframe containing 1 row per id and columns for each conditions.

   id  k1_condition  k2_condition
0   1  True          False
1   2  True          True

CodePudding user response:

here is one way to do it

df2=df.pivot_table(index='id',columns='K', values='V', aggfunc=['min','max']).reset_index()
df2.columns = ['_'.join(col) for col in df2.columns ]
df2['k1_condition'] = df2['min_k1'] > 1
df2['k2_condition'] = df2['max_k2'] <4
df2=df2.drop(columns=['min_k1','min_k2','max_k1','max_k2'])
df2

OR

df2=df.pivot_table(index='id',columns='K', values='V', aggfunc=['min','max']).reset_index()
df2['k1_condition'] = df2['min']['k1'] > 1
df2['k2_condition'] = df2['max']['k2'] <4
df2.drop(columns=['min','max'],level=0,inplace=True)
df2

    id_     k1_condition    k2_condition
0   1       True            False
1   2       True            True

CodePudding user response:

Dataframe.apply should work:

df["k1_condition"] = df.apply(lambda x: x["K"]=="k1" & x["V"]>1, axis=1)
df["k2_condition"] = df.apply(lambda x: x["K"]=="k2" & x["V"]>4, axis=1)
df2 = df[["id", "k1_condition", "k2_condition"]].groupy("id").any()

CodePudding user response:

You could use groupby using a conditions function then pivot into the shape you want.

def conditions(x):
    if x.name[1]=='k1':
        return (x>1).all()
    
    return (x<4).all()

df.groupby(['id','K']).apply(conditions).reset_index().pivot(index='id', columns='K') \
    .reset_index().set_axis(['id','k1_conditon', 'k2_condition'], axis=1, inplace=False)

Result

   id   k1_conditon  k2_condition
0   1          True         False
1   2          True          True
  • Related