I have a dataset with data formatted like this:
Name,Code
Mozambique,MZ
Myanmar,MM
Namibia,NA
Nauru,NR
Nepal,NP
Netherlands,NL
I'm loading this data into a database by first converting the CSV file to a dictionary.
I use the following command to perform the conversion:
dict_from_csv = pd.read_csv('test.csv', header=0, index_col=0, squeeze=True).to_dict()
When I do this, the value for the key Namibia is evaluated as nan.
I created a small test harness to validate this:
import pandas as pd
dict_from_csv = pd.read_csv('test.csv', header=0, index_col=0, squeeze=True).to_dict()
print(dict_from_csv)
The result of running this is:
{'Mozambique': 'MZ', 'Myanmar': 'MM', 'Namibia': nan, 'Nauru': 'NR', 'Nepal': 'NP', 'Netherlands': 'NL'}
Since I'm inserting this information into a database table with a NOT NULL constraint, this obviously doesn't work.
I've tried wrapping the NA in the data file in double quotes and ended up with the same result.
If I wrap the NA in the data file in single quotes, it does convert to a string correctly but is stored in the dictionary with the enclosing single quotes.
CodePudding user response:
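By default, pandas.read_csv treats a set of strings, including NA, as missing values and converts them to nan. The parameter keep_default_na controls this behaviour. Setting it to False solves the problem.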
The corrected line is:
dict_from_csv = pd.read_csv(
'test.csv', header=0, index_col=0, squeeze=True, keep_default_na=False
).to_dict()
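To see the difference without touching the file on disk, here is a minimal sketch that reads the same rows from an in-memory string. The variable names (csv_text, default_dict, fixed_dict) are only illustrative. It selects the Code column explicitly instead of passing squeeze=True, since that argument was deprecated in pandas 1.4 and removed in pandas 2.0.

```python
import io

import pandas as pd

# Same rows as test.csv, held in memory so the example is self-contained.
csv_text = """Name,Code
Mozambique,MZ
Myanmar,MM
Namibia,NA
Nauru,NR
Nepal,NP
Netherlands,NL
"""

# Default behaviour: the string "NA" is parsed as a missing value.
df_default = pd.read_csv(io.StringIO(csv_text), header=0, index_col=0)
default_dict = df_default["Code"].to_dict()

# keep_default_na=False: "NA" is kept as an ordinary string.
df_fixed = pd.read_csv(
    io.StringIO(csv_text), header=0, index_col=0, keep_default_na=False
)
fixed_dict = df_fixed["Code"].to_dict()

print(pd.isna(default_dict["Namibia"]))
print(fixed_dict["Namibia"])
```

Note that keep_default_na=False also stops empty fields from becoming nan; they are read as empty strings. If you still want some markers treated as missing, you can list them explicitly with the na_values parameter alongside keep_default_na=False.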
CodePudding user response:
Please upload your dataset (test.csv).