I am trying to read this file using read_csv in pandas(python). But I am not able to capture all columns. Can you help?
Here is the code:
file = r'path of file'
df = pd.read_csv(file, encoding='cp1252', on_bad_lines='skip')
Thank you
CodePudding user response:
I tried to read your file, and I first noticed that the encoding you specified does not correspond to the one used in your file. I also noticed that the separator is not a comma (,
) but a tab (\t
).
First, to get the file encoding (in linux), you just need to run:
$ file -i kopie.csv
kopie.csv: text/plain; charset=utf-16le
In Python:
import pandas as pd
path_to_file = 'kopie.csv'
df = pd.read_csv(path_to_file, encoding='utf-16le', sep='\t')
And when I print the shape of the loaded dataframe:
>>> df.shape
(869, 161)