I have an Excel file with a GEO column:
GEO
-------
EMEA
NA
LA
ASAP
EMEA
NA
NA
But when I read it in Python with pandas:
df = pd.read_excel(path + '\\' + file)
It reads "NA" as missing:
GEO
-------
EMEA
LA
ASAP
EMEA
I know how to tell pandas to treat additional values as missing, but I haven't found how to tell it to stop treating "NA" as missing.
CodePudding user response:
Use na_values and keep_default_na, according to the documentation of read_excel:
# This list was built from the default na_values, minus NA
NA_VALUES = ['', '#N/A', '#N/A N/A', '#NA', '-1.#IND', '-1.#QNAN', '-NaN', '-nan',
'1.#IND', '1.#QNAN', '<NA>', 'N/A', 'NULL', 'NaN', 'n/a', 'nan', 'null']
df = pd.read_excel(path + '\\' + file, na_values=NA_VALUES, keep_default_na=False)
Output:
>>> df
GEO
0 EMEA
1 NA
2 LA
3 ASAP
4 EMEA
5 NA
6 NA
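
If you want to check the behaviour without an Excel file at hand, here is a minimal sketch. It assumes read_csv is an acceptable stand-in, since it accepts the same na_values and keep_default_na parameters as read_excel. The in-memory data and the shortened NA_VALUES list are made up for illustration only.

```python
import io

import pandas as pd

# Hypothetical in-memory data standing in for the Excel sheet.
csv_text = "GEO\nEMEA\nNA\nLA\nASAP\nEMEA\nNA\nNA\n"

# Default parsing: "NA" is in pandas' default list of missing-value markers.
default_df = pd.read_csv(io.StringIO(csv_text))

# Disable the defaults and pass an explicit list that leaves out "NA".
# (Shortened list for illustration; use the full one above in real code.)
NA_VALUES = ['', '#N/A', 'N/A', 'NULL', 'NaN', 'n/a', 'nan', 'null']
kept_df = pd.read_csv(
    io.StringIO(csv_text), na_values=NA_VALUES, keep_default_na=False
)

print(default_df["GEO"].isna().sum())  # number of rows parsed as missing
print(kept_df["GEO"].tolist())         # "NA" kept as a plain string
```

With the defaults, the three "NA" rows should come back as missing; with keep_default_na=False and the custom list, they stay as the string "NA".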