How do I read NaN (Sodium Nitride) in pandas from csv as a string instead of NaN (Not a Number)?

Time:01-26

I am studying materials informatics in Python. I want pandas to read NaN (Sodium Nitride) as a chemical formula, i.e. as a string, but it is parsed as NaN (Not a Number) instead.

import pandas as pd

df = pd.read_csv('sample.csv', dtype={'formula': str})
print(df.loc[0, 'formula'])
# >> nan
print(type(df.loc[0, 'formula']))
# >> <class 'float'>

The CSV file being read is as follows:

id,formula
1,NaN
2,NaHCO3

CodePudding user response:

By default, read_csv treats the following strings as missing values (NaN), even when the column dtype is str:

''
'#N/A'
'#N/A N/A'
'#NA'
'-1.#IND'
'-1.#QNAN'
'-NaN'
'-nan'
'1.#IND'
'1.#QNAN'
'<NA>'
'N/A'
'NA'
'NULL'
'NaN'
'n/a'
'nan'
'null'
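
To see the default behaviour in isolation, here is a minimal sketch. It reads an in-memory CSV through io.StringIO rather than a file on disk; the extra rows (NA, null) are made up for illustration and are not part of the original sample.csv.

```python
import io

import pandas as pd

# Hypothetical in-memory CSV; rows 2 and 3 are added only for illustration.
csv_text = "id,formula\n1,NaN\n2,NA\n3,null\n4,NaHCO3\n"

# dtype=str alone does not stop the default NA strings from being converted.
df = pd.read_csv(io.StringIO(csv_text), dtype={"formula": str})

missing = df["formula"].isna().tolist()
print(missing)
# >> [True, True, True, False]
```

Only NaHCO3 survives as a string here; the other three values are in the default list and become missing values.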

Pass keep_default_na=False to turn off this default list, together with na_values=[''] so that only empty fields are still read as NaN:

df = pd.read_csv('sample.csv', na_values=[''], keep_default_na=False,
                 dtype={'formula': str})
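
As a quick check, the following sketch applies the same options to an in-memory copy of the sample data. The io.StringIO wrapper and the third row with an empty formula are assumptions added for illustration; the read_csv arguments are the ones shown above.

```python
import io

import pandas as pd

# Same data as sample.csv, plus a hypothetical third row with an empty formula.
csv_text = "id,formula\n1,NaN\n2,NaHCO3\n3,\n"

df = pd.read_csv(
    io.StringIO(csv_text),
    na_values=[""],          # only empty fields count as missing
    keep_default_na=False,   # disable the built-in list of NA strings
    dtype={"formula": str},
)

first = df.loc[0, "formula"]
print(first)
# >> NaN
print(isinstance(first, str))
# >> True
print(pd.isna(df.loc[2, "formula"]))
# >> True
```

Note that keep_default_na=False applies to every column, so strings such as NA or null in other columns will also be kept as text unless they are listed in na_values.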