How to replace 0 with "NA" and all other values to 0 in Python DataFrame?

Time:01-19

Consider the following DataFrame:

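import numpy as np
import pandas as pd

# Named `data` rather than `dict` so the built-in is not shadowed
data = {'col1': [1,2,np.nan,4],
        'col2': [1,2,0,4],
        'col3': [0,2,3,4],
        'col4': [1,2,0,np.nan],
}
df = pd.DataFrame(data)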
>> df

   col1  col2  col3  col4
0   1.0     1     0   1.0
1   2.0     2     2   2.0
2   NaN     0     3   0.0
3   4.0     4     4   NaN

I want to replace every 0 with the string "NA" and every other value with 0.

CodePudding user response:

Use numpy.where:

out = pd.DataFrame(np.where(df==0, 'NA', 0), index=df.index, columns=df.columns)

Output:

  col1 col2 col3 col4
0    0    0   NA    0
1    0    0    0    0
2    0   NA    0   NA
3    0    0    0    0
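
Two details of this output are easy to miss: NaN never compares equal to 0, so the missing cells are treated as "other values" and end up as 0, and the cells hold text rather than numbers. The sketch below rebuilds the question's sample frame to show both points. Passing the string "0" for the second branch is a substitution made here so that both branches share one type; it is not part of the answer above.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "col1": [1, 2, np.nan, 4],
    "col2": [1, 2, 0, 4],
    "col3": [0, 2, 3, 4],
    "col4": [1, 2, 0, np.nan],
})

# NaN == 0 is False, so missing values land in the "everything else" branch.
is_zero = df.eq(0)

# Both branches are strings (assumption made for this sketch), so NumPy
# does not have to find a common type for str and int.
out = pd.DataFrame(
    np.where(is_zero.to_numpy(), "NA", "0"),
    index=df.index,
    columns=df.columns,
)

print(is_zero.loc[2, "col1"])  # the NaN cell is not flagged as zero
print(out)
```

If numeric zeros are needed afterwards, the non-"NA" cells have to be converted back, because every cell in `out` is a string.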

Variant if you want to keep some columns intact:

keep = ['col1']

out = pd.DataFrame(np.where(df==0, 'NA', 0), index=df.index, columns=df.columns)

out[keep] = df[keep]

print(out)

Output:

   col1 col2 col3 col4
0   1.0    0   NA    0
1   2.0    0    0    0
2   NaN   NA    0   NA
3   4.0    0    0    0

CodePudding user response:

Use numpy.where and assign the output back:

df[:] = np.where(df.eq(0), 'NA', 0)
print (df)
   col1 col2 col3 col4
0    0    0   NA    0
1    0    0    0    0
2    0   NA    0   NA
3    0    0    0    0

Or use DataFrame.mask after converting the boolean mask to integers, so that its False values become 0:

m = df.eq(0)
df1 = m.astype(int).mask(m, 'NA')
print (df1)
    col1 col2 col3 col4
0     0    0   NA    0
1     0    0    0    0
2     0   NA    0   NA
3     0    0    0    0
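
The mask approach works in two steps, which the following sketch spells out on the question's sample frame. The intermediate name `as_int` and the cast to object are additions made here for illustration; the cast is only a precaution so that the integer columns can hold the string "NA" without relying on automatic upcasting.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "col1": [1, 2, np.nan, 4],
    "col2": [1, 2, 0, 4],
    "col3": [0, 2, 3, 4],
    "col4": [1, 2, 0, np.nan],
})

# Step 1: flag the zeros. True marks a cell that was 0.
m = df.eq(0)

# Step 2: True -> 1, False -> 0. The 0s are already the desired result
# for every cell that was not zero.
as_int = m.astype(int)

# Step 3: overwrite the flagged cells (the 1s) with "NA".
# astype(object) is an extra precaution added in this sketch.
df1 = as_int.astype(object).mask(m, "NA")

print(as_int)
print(df1)
```

Unlike the numpy.where version, the untouched cells here stay integer 0 rather than becoming text.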

EDIT: To process all columns except col1:

cols = df.columns.drop('col1')
df[cols] = np.where(df[cols].eq(0), 'NA', 0)
print (df)
   col1 col2 col3 col4
0   1.0    0   NA    0
1   2.0    0    0    0
2   NaN   NA    0   NA
3   4.0    0    0    0

To process all columns except the first one:

df.iloc[:, 1:] = np.where(df.iloc[:, 1:].eq(0), 'NA', 0)
print (df)
   col1 col2 col3 col4
0   1.0    0   NA    0
1   2.0    0    0    0
2   NaN   NA    0   NA
3   4.0    0    0    0
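
The variants above overwrite df in place. Depending on the pandas version, writing strings into numeric columns through `df[:] =` or `df.iloc[...] =` may trigger a dtype warning, so a non-mutating version can be handy. The helper below is a minimal sketch under that assumption; the name `zeros_to_na` and its `skip` parameter are hypothetical and do not come from either answer.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "col1": [1, 2, np.nan, 4],
    "col2": [1, 2, 0, 4],
    "col3": [0, 2, 3, 4],
    "col4": [1, 2, 0, np.nan],
})


def zeros_to_na(frame, skip=()):
    """Return a copy where 0 -> "NA" and any other value -> 0.

    Columns listed in `skip` are copied over unchanged.
    (Hypothetical helper, written for illustration only.)
    """
    out = frame.copy()
    cols = frame.columns.drop(list(skip))
    is_zero = frame[cols].eq(0)

    # Start from an object-dtype frame of 0s, then mark the former zeros.
    replaced = pd.DataFrame(0, index=frame.index, columns=cols, dtype=object)
    replaced = replaced.mask(is_zero, "NA")

    # Assigning by column label replaces the columns instead of writing
    # into the existing numeric ones.
    out[list(cols)] = replaced
    return out


result = zeros_to_na(df, skip=["col1"])
print(result)
```

Calling `zeros_to_na(df)` without `skip` processes every column, and df itself is left untouched in both cases.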