How to add the result of a divide function as a new column to the dataframe in Python?

Time:09-27

I've calculated the conditional probability using the code below. Now I would like to add the result of this calculation as a new column to my dataframe. Would that be possible with this code?

df.groupby(['mode','income_level'])['service'].value_counts() / df.groupby(['mode','income_level'])['service'].count()

CodePudding user response:

Use DataFrame.join if you need a new column built from your own solution:

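import pandas as pd

df = pd.DataFrame({'mode': list('aaaabbbb'),
                   'income_level': [5, 5, 5, 0, 5, 0, 0, 0],
                   'service': [1, 0] * 4})

# Conditional probability of each service value within (mode, income_level)
a = (df.groupby(['mode', 'income_level'])['service'].value_counts() /
     df.groupby(['mode', 'income_level'])['service'].count())

# Map the result back onto the rows through the three key columns
df = df.join(a.rename('new1'), on=['mode', 'income_level', 'service'])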
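
A possible shortcut for the same idea, as a minimal sketch: value_counts accepts normalize=True, which returns the share of each value within its group, so the explicit division by count() may not be needed. The column name 'prob' is a hypothetical choice for illustration, not something from the question.

```python
import pandas as pd

df = pd.DataFrame({'mode': list('aaaabbbb'),
                   'income_level': [5, 5, 5, 0, 5, 0, 0, 0],
                   'service': [1, 0] * 4})

# Share of each service value within its (mode, income_level) group
probs = (df.groupby(['mode', 'income_level'])['service']
           .value_counts(normalize=True)
           .rename('prob'))  # 'prob' is a hypothetical column name

# Look the share up for every row through the three key columns
out = df.join(probs, on=['mode', 'income_level', 'service'])
print(out)
```

On the sample data this should give the same values as the new1 column.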

Alternatively, use GroupBy.transform: instead of value_counts, add service to the groupby keys and take the group size with GroupBy.size:

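# Rows per (mode, income_level, service) combination, broadcast to every row
s1 = df.groupby(['mode', 'income_level', 'service'])['service'].transform('size')
# Non-null service values per (mode, income_level), broadcast to every row
s2 = df.groupby(['mode', 'income_level'])['service'].transform('count')

df['new'] = s1 / s2
print(df)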
  mode  income_level  service      new1       new
0    a             5        1  0.666667  0.666667
1    a             5        0  0.333333  0.333333
2    a             5        1  0.666667  0.666667
3    a             0        0  1.000000  1.000000
4    b             5        1  1.000000  1.000000
5    b             0        0  0.666667  0.666667
6    b             0        1  0.333333  0.333333
7    b             0        0  0.666667  0.666667
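
If the calculation is needed for several column combinations, the transform approach above could be wrapped in a small function. This is only a sketch: add_conditional_prob and its parameters (given, target, name) are hypothetical names, not part of pandas.

```python
import pandas as pd

def add_conditional_prob(df, given, target, name='prob'):
    """Return a copy of df with the share of each target value within
    its `given` group as a new column (hypothetical helper)."""
    # Rows per (given..., target) combination, aligned to the original rows
    joint = df.groupby(given + [target])[target].transform('size')
    # Non-null target values per `given` group, aligned to the original rows
    total = df.groupby(given)[target].transform('count')
    return df.assign(**{name: joint / total})

df = pd.DataFrame({'mode': list('aaaabbbb'),
                   'income_level': [5, 5, 5, 0, 5, 0, 0, 0],
                   'service': [1, 0] * 4})

out = add_conditional_prob(df, ['mode', 'income_level'], 'service')
print(out)
```

Because transform returns a result aligned to the original index, no join is needed and the row order is left unchanged.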