Using pandas find the minimum value in a column based on category levels of two other columns

Time:06-07

I have a dataframe, df, with three columns A, B, and C, as given below. I want a dataframe, df2, with three columns X, Y, and Z, where X holds each category of column A and Z holds the minimum value of column C among the rows of that category where column B is "YES". Also, how can I add a column D to df, as shown in df3, that marks the rows corresponding to this minimum value with 1 and all other rows with 0?

What I have tried:

df2 = df.loc[df.groupby("A")["C"].idxmin()]

This only uses columns A and C. How can I apply the condition on column B?

df

A   B   C
1   YES 20
1   YES 13
1   NO  18
1   NO  4
1   NO  9
1   YES 42
2   YES 22
2   NO  2
2   YES 3
2   YES 7
2   NO  51

df2

X   Y   Z
1   YES 13
2   YES 3

df3

A   B   C   D
1   YES 20  0
1   YES 13  1
1   NO  18  0
1   NO  4   0
1   NO  9   0
1   YES 42  0
2   YES 22  0
2   NO  2   0
2   YES 3   1
2   YES 7   0
2   NO  51  0

CodePudding user response:

Filter the YES rows, group by column A and take the minimum, then rename the columns:

df.query('B == "YES"').groupby('A', as_index=False).min().set_axis(['X', 'Y', 'Z'], axis=1)
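To try the one-liner above, here is a minimal sketch that rebuilds the sample df from the question and applies the same filter, group, and rename steps. The DataFrame construction is an assumption about how the data is stored (integers in A and C, strings in B).

```python
import pandas as pd

# Sample data copied from the question (dtypes are assumed).
df = pd.DataFrame({
    "A": [1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2],
    "B": ["YES", "YES", "NO", "NO", "NO", "YES", "YES", "NO", "YES", "YES", "NO"],
    "C": [20, 13, 18, 4, 9, 42, 22, 2, 3, 7, 51],
})

# Keep only the YES rows, take the per-group minimum of the remaining
# columns, then relabel A, B, C as X, Y, Z.
df2 = (
    df.query('B == "YES"')
      .groupby("A", as_index=False)
      .min()
      .set_axis(["X", "Y", "Z"], axis=1)
)
print(df2)
```

Because every remaining value in B is "YES", taking the minimum of that column simply returns "YES" for each group, which is why Y comes out as expected.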

CodePudding user response:

You can filter the YES rows, group by A, take the minimum of C, and then rename the columns:

df2 = df[df['B'] == 'YES'].groupby('A')['C'].min().to_frame().reset_index()
df2.insert(1, 'Y', 'YES')  # constant column, placed between X and Z
df2 = df2.rename(columns={'A': 'X', 'C': 'Z'})
df2
    X   Y    Z
0   1   YES  13
1   2   YES  3
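The second part of the question, adding the indicator column D shown in df3, could be handled by reusing the idxmin idea from the question on the YES rows only. This is a sketch under the assumption that df has a unique index. When several YES rows tie for the minimum within a group, idxmin picks only the first of them.

```python
import pandas as pd

# Sample data copied from the question (dtypes are assumed).
df = pd.DataFrame({
    "A": [1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2],
    "B": ["YES", "YES", "NO", "NO", "NO", "YES", "YES", "NO", "YES", "YES", "NO"],
    "C": [20, 13, 18, 4, 9, 42, 22, 2, 3, 7, 51],
})

# Restrict to YES rows so that NO rows can never be chosen as the minimum.
yes = df[df["B"] == "YES"]

# For each value of A, get the index label of the YES row with the smallest C.
min_idx = yes.groupby("A")["C"].idxmin()

# Flag those rows with 1 and every other row with 0.
df3 = df.assign(D=df.index.isin(min_idx).astype(int))
print(df3)
```

The same labels also give the full minimum rows directly via df.loc[min_idx], which is the conditional version of the attempt in the question.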