Given a dataframe df as follows:
id value1 value2 value3
0 0 22 1 7
1 1 0 0 0
2 2 0 0 0
3 3 4 1 25
4 4 5 0 24
5 5 0 0 3
Or:
df = pd.DataFrame([{'id': 0, 'value1': 22, 'value2': 1, 'value3': 7},
{'id': 1, 'value1': 0, 'value2': 0, 'value3': 0},
{'id': 2, 'value1': 0, 'value2': 0, 'value3': 0},
{'id': 3, 'value1': 4, 'value2': 1, 'value3': 25},
{'id': 4, 'value1': 5, 'value2': 0, 'value3': 24},
{'id': 5, 'value1': 0, 'value2': 0, 'value3': 3}])
I need to take the columns 'value1', 'value2', 'value3' and replace 0 with NaN, but only in rows where all three values are 0. How can I achieve that in pandas?
The expected result looks like this:
id value1 value2 value3
0 0 22.0 1.0 7.0
1 1 NaN NaN NaN
2 2 NaN NaN NaN
3 3 4.0 1.0 25.0
4 4 5.0 0.0 24.0
5 5 0.0 0.0 3.0
Code I have tried:
cols = ['value1', 'value2', 'value3']
df[cols] = df[cols].replace({'0': np.NaN, 0: np.NaN})
CodePudding user response:
Use mask:
df[cols] = df[cols].mask(df[cols].eq(0).all(axis=1))
mask automatically sets the whole row to NaN if the condition (df[cols].eq(0).all(axis=1)) is True for that row.
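For anyone who wants to run this end to end, here is a minimal self-contained sketch built on the question's data. The intermediate name all_zero is only illustrative and not part of the original answer.

```python
import pandas as pd

# Data from the question
df = pd.DataFrame({
    'id': [0, 1, 2, 3, 4, 5],
    'value1': [22, 0, 0, 4, 5, 0],
    'value2': [1, 0, 0, 1, 0, 0],
    'value3': [7, 0, 0, 25, 24, 3],
})
cols = ['value1', 'value2', 'value3']

# Boolean Series: True for rows where every selected column equals 0
all_zero = df[cols].eq(0).all(axis=1)

# mask() puts NaN in every selected column of the rows where the condition is True;
# the columns become float because NaN is a float value
df[cols] = df[cols].mask(all_zero)

print(df)
```

Rows 4 and 5 keep their zeros because at least one of their values is non-zero.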
Original answer:
I'd prefer mask:
>>> df.set_index('id').mask(df[cols].eq(0).all(axis=1))
value1 value2 value3
id
0 22.0 1.0 7.0
1 NaN NaN NaN
2 NaN NaN NaN
3 4.0 1.0 25.0
4 5.0 0.0 24.0
5 0.0 0.0 3.0
>>>
With resetting index:
>>> df.set_index('id').mask(df[cols].eq(0).all(axis=1)).reset_index()
id value1 value2 value3
0 0 22.0 1.0 7.0
1 1 NaN NaN NaN
2 2 NaN NaN NaN
3 3 4.0 1.0 25.0
4 4 5.0 0.0 24.0
5 5 0.0 0.0 3.0
>>>
CodePudding user response:
Perhaps:
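mask = df[cols].eq(0).all(axis=1)
# cast to float first so NaN can be assigned without a dtype-upcast warning in newer pandas
df[cols] = df[cols].astype(float)
df.loc[mask, cols] = np.nan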
Output:
id value1 value2 value3
0 0 22.0 1.0 7.0
1 1 NaN NaN NaN
2 2 NaN NaN NaN
3 3 4.0 1.0 25.0
4 4 5.0 0.0 24.0
5 5 0.0 0.0 3.0
CodePudding user response:
You're mixing types: your code tries to compare two different data types and throws a TypeError: Cannot compare types 'ndarray(dtype=float64)' and 'str'
To make your code work, simply remove '0': np.NaN from your dict, or pass two arguments instead of a dict:
df[cols] = df[cols].replace(0, float('NaN'))
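Keep in mind that replacing 0 directly touches every zero in those columns, including the ones in rows 4 and 5 that the expected result leaves alone. A small sketch comparing the two approaches on the question's data (the names replaced and row_wise are only illustrative):

```python
import numpy as np
import pandas as pd

# Data from the question
df = pd.DataFrame({
    'id': [0, 1, 2, 3, 4, 5],
    'value1': [22, 0, 0, 4, 5, 0],
    'value2': [1, 0, 0, 1, 0, 0],
    'value3': [7, 0, 0, 25, 24, 3],
})
cols = ['value1', 'value2', 'value3']

# Element-wise: every single 0 becomes NaN
replaced = df[cols].replace(0, np.nan)

# Row-wise: NaN only where all three columns are 0 in the same row
row_wise = df[cols].mask(df[cols].eq(0).all(axis=1))

print(replaced.isna().sum().sum())  # number of NaN cells after replace
print(row_wise.isna().sum().sum())  # number of NaN cells after the row-wise mask
```

If the NaN counts differ, as they do for this data, the plain replace is changing more cells than the question asks for.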