I have a pandas dataframe df as below:
       A    B
0   70.0   20
1    NaN   20
2   28.0  100
3   75.0  120
4   56.0   30
5   84.0   90
6    NaN  100
7   19.0   10
8   93.0   80
9   94.0   70
10  72.0   20
I am trying to replace each value of A with the average of A over all rows that share the same B. For instance, for B = 20, I would like all A values to be the average of 70 and 72, ignoring NaN. What is the best way to do this? I am thinking along the lines of groupby, as in:
df['AA']=df.groupby('B')['A'].transform(lambda s: s=s.mean())
That did not help.
CodePudding user response:
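If I understand correctly (IIUC), this should do it.
I created column AA just to keep a reference of what A was; you can always write the result back into column A. Note that rows where A is NaN are filtered out before grouping, so they stay NaN in AA.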
df['AA']=df[~df['A'].isnull()].groupby('B')['A'].transform('mean')
df
       A    B    AA
0   70.0   20  71.0
1    NaN   20   NaN
2   28.0  100  28.0
3   75.0  120  75.0
4   56.0   30  56.0
5   84.0   90  84.0
6    NaN  100   NaN
7   19.0   10  19.0
8   93.0   80  93.0
9   94.0   70  94.0
10  72.0   20  71.0
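To see why the NaN rows stay NaN here, a minimal self-contained sketch may help. It rebuilds the sample frame from the question and runs the filtered groupby; the intermediate name `aa` is only illustrative and not part of the original answer.

```python
import numpy as np
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "A": [70.0, np.nan, 28.0, 75.0, 56.0, 84.0, np.nan, 19.0, 93.0, 94.0, 72.0],
    "B": [20, 20, 100, 120, 30, 90, 100, 10, 80, 70, 20],
})

# Group means computed only over the rows where A is present.
# The result is indexed by the surviving rows (1 and 6 are missing).
aa = df[df["A"].notna()].groupby("B")["A"].transform("mean")

# Column assignment aligns on the index, so the filtered-out rows get NaN.
df["AA"] = aa
print(df)
```

If the NaN rows should receive the group mean as well, the filter has to be dropped, since `mean` already skips NaN on its own.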
CodePudding user response:
mean by default ignores NaNs, so the simplest method would just be:
df['AA'] = df.groupby('B')['A'].transform('mean')
Output:
       A    B    AA
0   70.0   20  71.0
1    NaN   20  71.0
2   28.0  100  28.0
3   75.0  120  75.0
4   56.0   30  56.0
5   84.0   90  84.0
6    NaN  100  28.0
7   19.0   10  19.0
8   93.0   80  93.0
9   94.0   70  94.0
10  72.0   20  71.0
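For completeness, here is a runnable sketch of this approach on the question's data. It selects column A explicitly before transforming, and it also shows that the lambda from the question works once the stray assignment is removed (`s=s.mean()` is a syntax error inside a lambda). Overwriting A in the last step is an assumption about what the asker ultimately wants; the names `group_mean` and `via_lambda` are illustrative.

```python
import numpy as np
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "A": [70.0, np.nan, 28.0, 75.0, 56.0, 84.0, np.nan, 19.0, 93.0, 94.0, 72.0],
    "B": [20, 20, 100, 120, 30, 90, 100, 10, 80, 70, 20],
})

# mean skips NaN by default, so every row (including the NaN ones)
# receives the mean of the non-missing A values in its B group.
group_mean = df.groupby("B")["A"].transform("mean")

# Equivalent lambda form: return the scalar, do not assign inside the lambda.
via_lambda = df.groupby("B")["A"].transform(lambda s: s.mean())

df["AA"] = group_mean

# Assumption: the asker wants A itself replaced by the group mean.
df["A"] = df["AA"]
print(df)
```

The string form `'mean'` is usually preferable to the lambda, because pandas can dispatch it to its built-in groupby implementation instead of calling a Python function per group.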