Change negative values in a column containing nan in python dataframe?

Time:06-09

Hi, I have a dataframe like this. I want to replace negative values with 0 while keeping the NaN values. The code below doesn't work: df['Data']<0 raises "'<' not supported between instances of 'str' and 'int'". Any simple suggestions?

df[(df['Data'].notnull())& (df['Data']<0)]

    Data
0   1
1   0.5
2   0
3   -0.5
4   -1
5   Nan
6   Nan

Wanted result:

    Data
0   1
1   0.5
2   0
3   0
4   0
5   Nan
6   Nan

CodePudding user response:

To replace numbers less than 0 with 0 while keeping NaN as is, select the negative rows with loc and assign 0 to them. A comparison against NaN evaluates to False, so the NaN rows are never selected.

import numpy as np
import pandas as pd

data1 = {'Data': [1, 0.5, -0.5, 0, -1, np.nan, np.nan]}
df = pd.DataFrame(data1)
>>> df
    Data
0   1.0
1   0.5
2   -0.5
3   0.0
4   -1.0
5   NaN
6   NaN

df.loc[df['Data']<0,'Data'] = 0
>>> df
    Data
0   1.0
1   0.5
2   0.0
3   0.0
4   0.0
5   NaN
6   NaN
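
As a small sketch of why this works on a numeric column, the snippet below prints the boolean mask that loc receives and shows Series.clip as an alternative that does not modify the column in place. The variable names mask and clipped are only illustrative, and the sample data mirrors the frame above.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"Data": [1, 0.5, -0.5, 0, -1, np.nan, np.nan]})

# Comparing NaN with 0 yields False, so NaN rows are not selected by the mask.
mask = df["Data"] < 0
print(mask.tolist())

# clip(lower=0) raises every value below 0 up to 0 and leaves NaN untouched.
clipped = df["Data"].clip(lower=0)
print(clipped.tolist())
```

Either form should give the wanted result as long as the column has a numeric dtype; clip returns a new Series, so assign it back with df["Data"] = clipped if you want to overwrite the column.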

CodePudding user response:

Going by your error message, it looks like your Data column has an object dtype, meaning it holds strings such as 'nan' alongside the numbers. You can get around it by converting the column to float.

>>> import pandas as pd
>>> x = [1, 0.5, 0, -0.5, -1, 'nan', 'Nan']
>>> df = pd.DataFrame(x, columns=['Data'])

This gives me the same error you describe:

>>> df[(df['Data'].notnull())& (df['Data']<0)] 
TypeError: '<' not supported between instances of 'str' and 'int'

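But this replaces the negative numbers while keeping the nan entries intact. Note that the column itself stays object dtype, because astype(float) is only applied inside the condition: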

>>> df.loc[(df['Data'].astype(float).notnull())& (df['Data'].astype(float)<0), ['Data']] = 0 
>>> df
  Data
0    1
1  0.5
2    0
3    0
4    0
5  nan
6  Nan
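
If you would rather end up with a real float column instead of converting inside the condition each time, one possible approach (a sketch, not the only way) is to convert the column once with pd.to_numeric and errors="coerce", which turns anything that cannot be parsed as a number into NaN. The sample data below is the same list used above.

```python
import pandas as pd

df = pd.DataFrame({"Data": [1, 0.5, 0, -0.5, -1, "nan", "Nan"]})

# Mixed numbers and strings give an object column, which is what triggers the TypeError.
print(df["Data"].dtype)

# Convert once; values that cannot be parsed as numbers become NaN.
df["Data"] = pd.to_numeric(df["Data"], errors="coerce")

# With a float column the plain comparison works, and NaN rows are left alone.
df.loc[df["Data"] < 0, "Data"] = 0
print(df)
```

Be aware that errors="coerce" also silently turns any other non-numeric strings into NaN, so it is worth checking the column for unexpected values before relying on it.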