How to replace numbers in selected columns that fall in a certain range of values in Python?

Time:11-29

How do you replace numbers with np.nan in selected columns when the number falls in one of two ranges?

A B C D
2 3 5 7
2 8 9 7
5 3 6 7

Select columns B and C, and replace a number if it is < 5 or > 7:

A B C D
2 NaN 5 7
2 NaN NaN 7
5 NaN 6 7

CodePudding user response:

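Use a boolean mask for in-place modification (boolean indexing):

import numpy as np

cols = ['B', 'C']
# reindex pads the mask with False for the columns outside cols, so they are left untouched
m = (df[cols].gt(7)|df[cols].lt(5)).reindex(columns=df.columns, fill_value=False)

df[m] = np.nan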

If you need a copy:

cols = ['B', 'C']
out = df.mask((df[cols].gt(7)|df[cols].lt(5))
              .reindex(columns=df.columns, fill_value=False))

Output:

   A   B    C  D
0  2 NaN  5.0  7
1  2 NaN  NaN  7
2  5 NaN  6.0  7

Intermediates:

(df[cols].gt(7)|df[cols].lt(5))

      B      C
0  True  False
1  True   True
2  True  False

(df[cols].gt(7)|df[cols].lt(5)).reindex(columns=df.columns, fill_value=False)

       A     B      C      D
0  False  True  False  False
1  False  True   True  False
2  False  True  False  False
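
The steps above can be put together as a self-contained sketch. The construction of df is an assumption here (the question only shows the printed table), and the name outside is introduced purely for illustration:

```python
import numpy as np
import pandas as pd

# Sample frame rebuilt by hand from the table in the question (assumed construction)
df = pd.DataFrame({"A": [2, 2, 5], "B": [3, 8, 3], "C": [5, 9, 6], "D": [7, 7, 7]})

cols = ["B", "C"]

# True where a value in B or C is greater than 7 or less than 5
outside = df[cols].gt(7) | df[cols].lt(5)

# Expand the mask to the full column set; columns outside cols get False everywhere
m = outside.reindex(columns=df.columns, fill_value=False)

# DataFrame.mask returns a copy with NaN wherever the mask is True
out = df.mask(m)
print(out)
```

Because NaN is a floating-point value, any column that receives one is converted to float, which is why C prints as 5.0 and 6.0 in the output.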

CodePudding user response:

You can assign back to filtered columns with DataFrame.mask:

cols = ['B', 'C']

df[cols] = df[cols].mask(df[cols].gt(7) | df[cols].lt(5))
print(df)
   A   B    C  D
0  2 NaN  5.0  7
1  2 NaN  NaN  7
2  5 NaN  6.0  7

Or with numpy.where:

cols = ['B', 'C']

df[cols] = np.where(df[cols].gt(7) | df[cols].lt(5), np.nan, df[cols])
print(df)
   A   B    C  D
0  2 NaN  5.0  7
1  2 NaN  NaN  7
2  5 NaN  6.0  7
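
A possible variation, not taken from the answers above: the same condition can be written the other way round, keeping the values that lie inside the closed interval from 5 to 7 with DataFrame.where instead of masking the ones outside it. This is only a sketch; the df construction and the name inside are assumptions made for illustration:

```python
import numpy as np
import pandas as pd

# Sample frame rebuilt by hand from the table in the question (assumed construction)
df = pd.DataFrame({"A": [2, 2, 5], "B": [3, 8, 3], "C": [5, 9, 6], "D": [7, 7, 7]})

cols = ["B", "C"]

# True where a value in B or C lies in the closed interval [5, 7]
inside = df[cols].ge(5) & df[cols].le(7)

# DataFrame.where keeps values where the condition is True and puts NaN elsewhere
out = df.copy()
out[cols] = out[cols].where(inside)
print(out)
```

The condition here is the logical negation of gt(7) | lt(5), so for these integer columns the result should match the mask and numpy.where versions.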