Replace specific values in a data frame with column mean

Time:10-17

I have a dataframe and I want to replace every value 7 with the rounded mean of its column, computed without counting the 7s in that column. Here is a simple example:

import pandas as pd
df = pd.DataFrame()
df['a'] = [1, 2, 3]
df['b'] =[3, 0, -1]
df['c'] = [4, 7, 6]
df['d'] = [7, 7, 6]

   a  b  c  d
0  1  3  4  7
1  2  0  7  7
2  3 -1  6  6

And here is the output I want:

   a  b  c  d
0  1  3  4  2
1  2  0  3  2
2  3 -1  6  6

For example, in row 1, the mean of column c is 3.33, i.e. (4 + 0 + 6) / 3 with the 7 counted as 0, which rounds to 3. In column d the mean is 2 (since we do not count the other 7 in that column either).
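The arithmetic above can be checked with a short sketch. It assumes, as the numbers 3.33 and 2 imply, that each 7 contributes 0 to the sum while the divisor stays the full column length. The variable names are only for illustration.

```python
import pandas as pd

c = pd.Series([4, 7, 6])
d = pd.Series([7, 7, 6])

# Keep values that are not 7, put 0 where the value is 7,
# then take the mean over all rows of the column
mean_c = c.where(c != 7, 0).mean()  # (4 + 0 + 6) / 3
mean_d = d.where(d != 7, 0).mean()  # (0 + 0 + 6) / 3

print(mean_c, round(mean_c))
print(mean_d, round(mean_d))
```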

Can you please help me with that?

CodePudding user response:

Here is one way to do it:

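import numpy as np

# replace 7 with np.nan so those cells can be filled afterwards
df.replace(7, np.nan, inplace=True)

# fill NaN values with the mean of the column; the NaNs count as 0
# in that mean, then the result is rounded and cast back to int
df = (df.fillna(df.apply(lambda x: x.replace(np.nan, 0)
                         .mean(skipna=False)))
      .round(0)
      .astype(int))
print(df)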

   a  b  c  d
0  1  3  4  2
1  2  0  3  2
2  3 -1  6  6

CodePudding user response:

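# copy of df in which every 7 counts as 0 (replace returns a new frame)
temp = df.replace(to_replace=7, value=0)
# replace each 7 with its rounded column mean; astype(int) alone would truncate
df.replace(to_replace=7, value=temp.mean().round().astype(int), inplace=True)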
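
As a self-contained variation on the answers above, here is a minimal sketch using `DataFrame.mask`. The function name `replace_with_rounded_mean` and its `target` parameter are made up for illustration. It rounds explicitly before casting, because `astype(int)` on its own truncates, so a column mean of 3.67 would become 3 rather than 4.

```python
import pandas as pd


def replace_with_rounded_mean(df: pd.DataFrame, target: int = 7) -> pd.DataFrame:
    """Return a copy of df with `target` replaced by the rounded column mean."""
    is_target = df.eq(target)
    # Count each target cell as 0 so it adds nothing to the sum;
    # the divisor is still the number of rows in the column
    col_means = df.mask(is_target, 0).mean().round().astype(int)
    # With axis=1 the Series of means is aligned on the column labels
    return df.mask(is_target, col_means, axis=1)


df = pd.DataFrame({"a": [1, 2, 3], "b": [3, 0, -1], "c": [4, 7, 6], "d": [7, 7, 6]})
result = replace_with_rounded_mean(df)
print(result)
```

Unlike the `inplace=True` versions, this leaves the original `df` untouched and returns a new frame.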