I'm trying to read a CSV file that contains raw data that I want to sanitize.
If I try to run this code:
import pandas as pd

# 1) Read CSV file
file = 'data/opel_corsa_01.csv'
df = pd.read_csv(file, error_bad_lines=False, engine='python')
I get the following error:
Error tokenizing data. C error: Expected 5 fields in line 7, saw 7
I think it's because some columns don't have any data in them?
After looking online, I saw various solutions, but none of them seemed to fix my problem. For example:
#1)Read CSV File
file = 'data/opel_corsa_01.csv'
df = pd.read_csv(file,error_bad_lines=False, engine ='python')
Would anyone know how to fix that?
CodePudding user response:
Have you tried specifying the separator and the encoding?
df = pd.read_csv(file, sep=";", encoding="latin1", error_bad_lines=False, engine='python')
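separator: from your image, it appears that the fields are separated by ";", not by the default ",".
encoding: latin1 is about the most permissive choice, because it maps every possible byte to a character and so never raises a decoding error (although it can show wrong characters if the file actually uses another encoding).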
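If you are not sure which separator the file uses, or which lines trigger the "Expected 5 fields ..., saw 7" message, a quick check with the standard library can help. This is only a minimal sketch: the sample text, its column names, and the helper names (guess_delimiter, field_counts, bad_lines) are made up for illustration, and the delimiter guess simply counts candidate characters in the header line.

```python
import csv
import io
from collections import Counter

# Hypothetical sample standing in for the real file: ';' separated,
# with one row (line 4) that has more fields than the header.
SAMPLE = (
    "id;speed;rpm;temp;gear\n"
    "1;50;2000;90;3\n"
    "2;55;2100;91;3\n"
    "3;60;2200;92;4;extra;fields\n"
    "4;65;2300;93;4\n"
)

def guess_delimiter(text, candidates=(";", ",", "\t", "|")):
    """Pick the candidate that occurs most often in the header line."""
    header = text.splitlines()[0]
    return max(candidates, key=header.count)

def field_counts(text, delimiter):
    """Return {line_number: number_of_fields} for every non-empty row."""
    reader = csv.reader(io.StringIO(text), delimiter=delimiter)
    return {i: len(row) for i, row in enumerate(reader, start=1) if row}

def bad_lines(text, delimiter):
    """Return the most common field count and the lines that deviate from it."""
    counts = field_counts(text, delimiter)
    expected = Counter(counts.values()).most_common(1)[0][0]
    return expected, [n for n, c in counts.items() if c != expected]

delimiter = guess_delimiter(SAMPLE)
expected, offending = bad_lines(SAMPLE, delimiter)
print(f"delimiter={delimiter!r}, expected fields={expected}, bad lines={offending}")
```

For a real file you would read its text first (for example with open(file, encoding="latin1")) and then pass the detected delimiter as sep to pd.read_csv. Note also that from pandas 1.3 onwards error_bad_lines is deprecated in favour of on_bad_lines="skip", and it was removed in pandas 2.0, so on a recent version the call would use on_bad_lines="skip" instead.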