Home > Mobile >  Pandas group by and fraction where two columns equal
Pandas group by and fraction where two columns equal

Time:05-12

I want to get the fraction of rows where two columns are equal, where I group by another column.

In the example below, I want to group by col1, and compare col2 == col3.

Input:

col1 | col2 | col3
A    |  c   |  c
A    |  d   |  g
B    |  c   |  c
B    |  d   |  d

Desired output:

A | 0.5 #its 50 % because c==c, d != g
B | 1.0 #its 100% because c==c, d==d

Is there a way to do this in pandas in one command?

CodePudding user response:

You can try group by col1 then sum the rows that col2 is equal with col3 and divide by the group length.

out = df.groupby('col1').apply(lambda g: g['col2'].eq(g['col3']).sum()/len(g))
print(out)

col1
A    0.5
B    1.0
dtype: float64

CodePudding user response:

groupby mean:

df['col2'].eq(df['col3']).groupby(df['col1']).mean()

col1
A    0.5
B    1.0
dtype: float64
  • Related