I thought this would make every value be read as a string, but it doesn't:
df = pd.read_csv(file, sep='\t', dtype=str, low_memory=False)
When I then iterate over the rows:
for index, row in df.iterrows():
    id_value = row['id']
    ...
the error message says that id_value is a float, which can't be concatenated with a str.
Why doesn't dtype=str make every value in the DataFrame a string?
CodePudding user response:
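According to the read_csv documentation, dtype=str has to be combined with suitable na_values settings:
"Use str or object together with suitable na_values settings to preserve and not interpret dtype."
Note that na_values="" on its own is not enough, because the empty string is already one of the default NA markers; passing keep_default_na=False (or na_filter=False) is what keeps empty fields as empty strings.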
NaN is a float (unless you convert to the new pandas.NA), so if your file has missing values, they are read as NaN even with dtype=str, and this is likely the origin of your error.
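To make this concrete, here is a minimal sketch. The tab-separated sample data and the variable names are made up for illustration, and keep_default_na=False is one possible way to keep empty fields as strings rather than something stated in the question:

```python
import io

import pandas as pd

# Hypothetical tab-separated data; the second row has an empty 'id' field.
data = "id\tname\n1\talice\n\tbob\n"

# dtype=str alone: the empty field is still parsed as a missing value (NaN).
df_default = pd.read_csv(io.StringIO(data), sep="\t", dtype=str)
missing_default = df_default.loc[1, "id"]

# keep_default_na=False stops pandas from treating empty fields as missing,
# so the same cell comes back as an empty string.
df_strings = pd.read_csv(
    io.StringIO(data), sep="\t", dtype=str, keep_default_na=False
)
missing_kept = df_strings.loc[1, "id"]

print(pd.isna(missing_default))  # whether the cell is a missing value
print(repr(missing_kept))        # the raw value read with keep_default_na=False
```

If you would rather keep the missing values, df.fillna("") after reading is another option that leaves only strings in the frame.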
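Also, I am not sure which operation you want to perform, but if you make it vectorized (i.e. operate on the whole column instead of using iterrows), pandas should handle the NaNs automatically: rows with a missing value yield NaN instead of raising an error.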
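As a rough illustration of the vectorized approach, assuming the goal is to prepend a prefix to each id (the "item-" prefix and the column names label and label_filled are hypothetical):

```python
import numpy as np
import pandas as pd

# Hypothetical frame with one missing 'id', similar to what read_csv
# produces with dtype=str when a field is empty.
df = pd.DataFrame({"id": ["1", np.nan, "3"]}, dtype=object)

# Vectorized concatenation on the whole column: the row with NaN
# yields NaN instead of raising a TypeError.
df["label"] = "item-" + df["id"]

# Alternatively, fill the gaps first so every row produces a string.
df["label_filled"] = "item-" + df["id"].fillna("")

print(df)
```

Which variant fits depends on whether a missing id should stay missing in the result or be treated as an empty string.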