I'm trying to read a csv file with the following values [23, " 5,000.00 ", A]
I need to read the number with a comma, but I'm not able to do it even if I pass a decimal and a quotechar.
I'm reading a CSV that df.read_csv(path, delimiter=',', quotechar='"', decimal=',', thousands='.', dtype=str)
I'm getting the following output if I run df.values.tolist()
=> ['23', ' " 5', '000.00 "', 'A']
What might be the problem?
The column with this type of data is always different. So I cannot transform it based on column name.
dtype=str
should be left as a string, as it's necessary for the future.
CodePudding user response:
You have to set skipinitialspace's value.
I have the following code:
import pandas as pd
result = pd.read_csv("data.csv", sep=',', quotechar='"', skipinitialspace=True)
print(result.columns)
And my data.csv looks like this: 23; " 5,000.0 "; A
Running the code above give this result: Index([u'23', u' 5,000.0 ', u'A'], dtype='object')
skipinitialspace will remove strip off the extra spaces from " 5,000.0 ". If you try to remove the extra spaces by hand, then you can leave out this parameter and it will work just fine as well.