Hi, I have a dataframe like this. I want to replace negative values with 0 while keeping the NaN values. The code below doesn't work because df['Data']<0 raises
TypeError: '<' not supported between instances of 'str' and 'int'. Any simple suggestions?
df[(df['Data'].notnull())& (df['Data']<0)]
Data
0 1
1 0.5
2 0
3 -0.5
4 -1
5 Nan
6 Nan
Wanted result:
Data
0 1
1 0.5
2 0
3 0
4 0
5 Nan
6 Nan
CodePudding user response:
To replace numbers less than 0 with 0 while keeping NaN as is, you can select the negative rows with loc and assign 0 to them. This assumes the column has a numeric dtype.
Code:
import numpy as np
import pandas as pd

data1 = {'Data': [1, 0.5, -0.5, 0, -1, np.nan, np.nan]}
df = pd.DataFrame(data1)
>>> df
Data
0 1.0
1 0.5
2 -0.5
3 0.0
4 -1.0
5 NaN
6 NaN
df.loc[df['Data']<0,'Data'] = 0
>>> df
Data
0 1.0
1 0.5
2 0.0
3 0.0
4 0.0
5 NaN
6 NaN
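The loc assignment leaves the NaN rows alone because a comparison against NaN evaluates to False, so those rows never enter the mask. Here is a minimal sketch of that, assuming the column already has a float dtype. The clip call is an alternative I am swapping in for illustration, not part of the answer above:

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'Data': [1, 0.5, -0.5, 0, -1, np.nan, np.nan]})

# NaN < 0 is False, so the mask picks out only the truly negative rows
mask = df['Data'] < 0

# Alternative: clip raises every value below 0 up to 0 and leaves NaN untouched
clipped = df['Data'].clip(lower=0)
```

Either approach should give the same column here; clip returns a new Series, so assign it back to df['Data'] if you want to keep the result.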
CodePudding user response:
Going by your error message, it looks like your Data column has an object dtype. You can get around it by converting it to float.
>>> x = [1, 0.5, 0, -0.5, -1, 'nan', 'Nan']
>>> df = pd.DataFrame(x, columns=['Data'])
This gives me the same error you describe:
>>> df[(df['Data'].notnull())& (df['Data']<0)]
TypeError: '<' not supported between instances of 'str' and 'int'
But this replaces the negative numbers while keeping the nan values intact:
>>> df.loc[(df['Data'].astype(float).notnull())& (df['Data'].astype(float)<0), ['Data']] = 0
>>> df
Data
0 1
1 0.5
2 0
3 0
4 0
5 nan
6 Nan
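If you would rather convert the column once instead of calling astype(float) inside every mask, something like the sketch below might work. It assumes the non-numeric entries are only NaN placeholders such as 'nan' or 'Nan'. Unlike the snippet above, it turns those strings into real NaN values rather than keeping them as text:

```python
import pandas as pd

df = pd.DataFrame({'Data': [1, 0.5, 0, -0.5, -1, 'nan', 'Nan']})

# Convert the object column to numbers once; with errors='coerce',
# anything that cannot be parsed as a number becomes NaN
df['Data'] = pd.to_numeric(df['Data'], errors='coerce')

# The column is now float64, so the plain comparison no longer raises TypeError
df.loc[df['Data'] < 0, 'Data'] = 0
```

After this the column is numeric, so later comparisons and arithmetic on it should behave as expected.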