I have a dataframe and I want to replace the value 7
with the round number of mean of its columns with out other 7
in that columns. Here is a simple example:
import pandas as pd
df = pd.DataFrame()
df['a'] = [1, 2, 3]
df['b'] =[3, 0, -1]
df['c'] = [4, 7, 6]
df['d'] = [7, 7, 6]
a b c d
0 1 3 4 7
1 2 0 7 7
2 3 -1 6 6
And here is the output I want:
a b c d
0 1 3 4 2
1 2 0 3 2
2 3 -1 6 6
For example, in row 1, the mean of column c is equal to 3.33 and then its round is 3, and in column column d is equal to 2 (since we do not consider the other 7 in that column).
Can you please help me with that?
CodePudding user response:
here is one way to do it
# replace 7 with np.nan
df.replace(7,np.nan, inplace=True)
# fill NaN values with the mean of the column
(df.fillna(df.apply(lambda x: x.replace(np.nan, 0)
.mean(skipna=False) ))
.round(0)
.astype(int))
a b c d
0 1 3 4 2
1 2 0 3 2
2 3 -1 6 6
CodePudding user response:
temp = df.replace(to_replace=7, value=0, inplace=False).copy()
df.replace(to_replace=7, value=temp.mean().astype(int), inplace=True)