Pandas change existing column values based on another column

I have a pandas dataframe df as below:

       A    B
0   70.0   20
1    NaN   20
2   28.0  100
3   75.0  120
4   56.0   30
5   84.0   90
6    NaN  100
7   19.0   10
8   93.0   80
9   94.0   70
10  72.0   20

I am trying to replace the values of A with the average of A within each group of B. For instance, for B = 20, I would like all A values to be the average of 70 and 72, ignoring NaN. What is the best way to do this? I am thinking along the lines of groupby, as in…

df['AA']=df.groupby('B')['A'].transform(lambda s: s=s.mean())

That did not help.

CodePudding user response:

IIUC

I created column AA just to keep a reference of what 'A' was. You can always write the result back to column 'A'.

df['AA']=df[~df['A'].isnull()].groupby('B')['A'].transform('mean')
df

       A      B       AA
0   70.0     20     71.0
1    NaN     20      NaN
2   28.0    100     28.0
3   75.0    120     75.0
4   56.0     30     56.0
5   84.0     90     84.0
6    NaN    100      NaN
7   19.0     10     19.0
8   93.0     80     93.0
9   94.0     70     94.0
10  72.0     20     71.0
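
Rows 1 and 6 stay NaN in this output because the filter drops them before grouping, and the assignment then aligns on the index, so nothing is written for the dropped rows. Here is a minimal sketch that rebuilds the sample frame and checks this. Building the frame with numpy and the helper name means are my assumptions, not part of the question.

```python
import numpy as np
import pandas as pd

# Sample data copied from the question (construction is an assumption).
df = pd.DataFrame({
    "A": [70, np.nan, 28, 75, 56, 84, np.nan, 19, 93, 94, 72],
    "B": [20, 20, 100, 120, 30, 90, 100, 10, 80, 70, 20],
})

# Drop the NaN rows first, then broadcast each group's mean to its rows.
means = df[df["A"].notna()].groupby("B")["A"].transform("mean")

# Assignment aligns on the index: rows missing from `means` become NaN.
df["AA"] = means

print(len(means))               # number of rows that survived the filter
print(df["AA"].isna().sum())    # rows left without a group mean
```

If the NaN rows should receive the group mean as well, skip the filter.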

CodePudding user response:

mean ignores NaNs by default, so the simplest method would just be:

df['AA'] = df.groupby('B')['A'].transform('mean')

Output:

       A    B    AA
0   70.0   20  71.0
1    NaN   20  71.0
2   28.0  100  28.0
3   75.0  120  75.0
4   56.0   30  56.0
5   84.0   90  84.0
6    NaN  100  28.0
7   19.0   10  19.0
8   93.0   80  93.0
9   94.0   70  94.0
10  72.0   20  71.0
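
To overwrite A itself rather than add a new column, the same transform can be assigned straight back. This is a self-contained sketch under the assumption that the frame looks like the one in the question; the variable name group_mean is only illustrative. Selecting the 'A' column before transform keeps the result a Series even when the frame has more columns than A and B.

```python
import numpy as np
import pandas as pd

# Sample data copied from the question (construction is an assumption).
df = pd.DataFrame({
    "A": [70, np.nan, 28, 75, 56, 84, np.nan, 19, 93, 94, 72],
    "B": [20, 20, 100, 120, 30, 90, 100, 10, 80, 70, 20],
})

# Per-group mean of A, broadcast back to every row of the group.
# mean skips NaN by default, so B == 20 gives (70 + 72) / 2.
group_mean = df.groupby("B")["A"].transform("mean")

# Replace the original values, NaN rows included.
df["A"] = group_mean

print(df)
```

One caveat: a group whose A values are all NaN has no mean to compute, so its rows remain NaN.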