How do you replace numbers with np.nan in selected columns if the number falls outside a given range?
A | B | C | D |
---|---|---|---|
2 | 3 | 5 | 7 |
2 | 8 | 9 | 7 |
5 | 3 | 6 | 7 |
Select columns B and C and replace a number with NaN if it is less than 5 or greater than 7:
A | B | C | D |
---|---|---|---|
2 | NaN | 5 | 7 |
2 | NaN | NaN | 7 |
5 | NaN | 6 | 7 |
CodePudding user response:
Use a boolean mask for in place modification (boolean indexing):
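import numpy as np
cols = ['B', 'C']
# True where B or C is above 7 or below 5; the other columns are filled with False
m = (df[cols].gt(7) | df[cols].lt(5)).reindex(columns=df.columns, fill_value=False)
df[m] = np.nan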
If you need a copy:
cols = ['B', 'C']
out = df.mask(
    (df[cols].gt(7) | df[cols].lt(5))
    .reindex(columns=df.columns, fill_value=False)
)
Output:
   A   B    C  D
0  2 NaN  5.0  7
1  2 NaN  NaN  7
2  5 NaN  6.0  7
Intermediates:
(df[cols].gt(7)|df[cols].lt(5))
      B      C
0  True  False
1  True   True
2  True  False
(df[cols].gt(7)|df[cols].lt(5)).reindex(columns=df.columns, fill_value=False)
       A     B      C      D
0  False  True  False  False
1  False  True   True  False
2  False  True  False  False
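For anyone who wants to run this end to end, here is a minimal self-contained sketch of the copy-based variant. The DataFrame is rebuilt from the sample in the question, and the names outside and full_mask are only illustrative.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({"A": [2, 2, 5], "B": [3, 8, 3], "C": [5, 9, 6], "D": [7, 7, 7]})

cols = ["B", "C"]

# True where a value in the selected columns is below 5 or above 7.
outside = df[cols].lt(5) | df[cols].gt(7)

# Expand the mask to every column; the unselected columns become all False.
full_mask = outside.reindex(columns=df.columns, fill_value=False)

# DataFrame.mask returns a new frame with NaN wherever the mask is True.
out = df.mask(full_mask)
print(out)
```

Because mask returns a new object, df itself keeps its original values here.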
CodePudding user response:
You can assign back to the filtered columns with DataFrame.mask:
cols = ['B', 'C']
df[cols] = df[cols].mask(df[cols].gt(7) | df[cols].lt(5))
print(df)
   A   B    C  D
0  2 NaN  5.0  7
1  2 NaN  NaN  7
2  5 NaN  6.0  7
Or with numpy.where:
cols = ['B', 'C']
df[cols] = np.where(df[cols].gt(7) | df[cols].lt(5), np.nan, df[cols])
print(df)
   A   B    C  D
0  2 NaN  5.0  7
1  2 NaN  NaN  7
2  5 NaN  6.0  7
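If the bounds change often, the same idea can be wrapped in a small function. This is only a sketch: nan_outside_range is a hypothetical helper name, not a pandas function, and it uses DataFrame.where (keep the values inside the range) instead of mask (blank the values outside it).

```python
import pandas as pd


def nan_outside_range(df, cols, low, high):
    """Return a copy of df where values in cols outside [low, high] are NaN.

    Hypothetical helper for illustration; not part of pandas.
    """
    out = df.copy()
    # True for values that lie inside the closed interval [low, high].
    keep = out[cols].ge(low) & out[cols].le(high)
    # DataFrame.where keeps values where the condition is True, NaN elsewhere.
    out[cols] = out[cols].where(keep)
    return out


# Sample data copied from the question.
df = pd.DataFrame({"A": [2, 2, 5], "B": [3, 8, 3], "C": [5, 9, 6], "D": [7, 7, 7]})
result = nan_outside_range(df, ["B", "C"], 5, 7)
print(result)
```

Writing the condition as "keep what is inside the range" avoids the mix-up between "and" and "or" that is easy to make when describing the values to drop.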