Pandas groupby a column and apply function to create a new column

I am trying to assign a category to each row by grouping members that share an admitting code and comparing each row's length of stay (LOS) with the mean LOS of its group.

For example, I have the following data frame:

MemberID	AdmittingCode	LOS
1	a	5
2	a	10
1	b	2
2	b	1

I want to group by AdmittingCode, take the mean of LOS within each group, and set the category to 0 if a row's LOS is less than its group mean, or to 1 otherwise.

For admitting code 'a' the LOS values are 5 and 10, so the mean is 7.5. MemberID 1 with AdmittingCode 'a' has LOS 5, which is below 7.5, so it gets category 0. Applying the same logic to every row gives the following data frame:

MemberID	AdmittingCode	LOS	LOSCategory
1	a	5	0
2	a	10	1
1	b	2	1
2	b	1	0

CodePudding user response:

Use GroupBy.transform with 'mean', which returns the group mean aligned to each original row, and compare the result with the original column:

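# True where group mean <= LOS, so a LOS equal to the mean is categorized as 1
m = df.groupby('AdmittingCode')['LOS'].transform('mean').le(df['LOS'])
# Convert the boolean mask to 0/1
df['LOSCategory'] = m.astype(int)
print(df)
   MemberID AdmittingCode  LOS  LOSCategory
0         1             a    5            0
1         2             a   10            1
2         1             b    2            1
3         2             b    1            0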
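
For a version that can be pasted and run as is, here is a minimal sketch that rebuilds the sample frame from the question. The GroupMean column is not part of the original answer; it is a hypothetical helper added only to make the comparison visible. The sketch follows the rule stated in the question (below the group mean is 0, anything else is 1).

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "MemberID": [1, 2, 1, 2],
    "AdmittingCode": ["a", "a", "b", "b"],
    "LOS": [5, 10, 2, 1],
})

# transform("mean") returns one value per original row: the mean LOS of
# that row's AdmittingCode group. GroupMean is an illustrative helper column.
df["GroupMean"] = df.groupby("AdmittingCode")["LOS"].transform("mean")

# 0 when LOS is below its group mean, 1 otherwise (a tie counts as 1).
df["LOSCategory"] = (df["LOS"] >= df["GroupMean"]).astype(int)

print(df)
```

Because transform keeps the original index and length, the comparison is row by row and no merge back onto df is needed.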

If the categories need to be the strings '1' and '0' instead, use either of the following:

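# Option 1: cast the 0/1 integers to strings
df['LOSCategory'] = m.astype(int).astype(str)

# Option 2: map the boolean mask directly with NumPy
import numpy as np
df['LOSCategory'] = np.where(m, '1', '0')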
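
As a quick check that the two string variants agree, here is a small self-contained sketch on the question's sample data. The names group_mean, via_astype and via_where are made up for this illustration.

```python
import numpy as np
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "MemberID": [1, 2, 1, 2],
    "AdmittingCode": ["a", "a", "b", "b"],
    "LOS": [5, 10, 2, 1],
})

# Boolean mask: True where LOS is not below its group mean.
group_mean = df.groupby("AdmittingCode")["LOS"].transform("mean")
m = df["LOS"] >= group_mean

# Variant 1: bool -> int -> str.
via_astype = m.astype(int).astype(str)

# Variant 2: pick the string directly from the mask.
via_where = pd.Series(np.where(m, "1", "0"), index=df.index)

print(via_astype.tolist())
print(via_where.tolist())
```

The intermediate astype(int) in the first variant matters: casting the boolean mask straight to str would produce 'True' and 'False' rather than '1' and '0'.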