I am trying to load a CSV file with pandas in PyCharm. Here is my code:
import pandas as pd
data=pd.read_csv(r'C:\Users\agns1\Downloads\data_work_final.csv')
Sadly, I'm getting this error:
File "pandas\_libs\parsers.pyx", line 542, in pandas._libs.parsers.TextReader.__cinit__
File "pandas\_libs\parsers.pyx", line 642, in pandas._libs.parsers.TextReader._get_header
File "pandas\_libs\parsers.pyx", line 843, in pandas._libs.parsers.TextReader._tokenize_rows
File "pandas\_libs\parsers.pyx", line 1917, in pandas._libs.parsers.raise_parser_error
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xef in position 4: invalid continuation byte
Any thoughts on how I can fix this?
CodePudding user response:
You need to check the file encoding:
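import chardet  # third-party package: pip install chardet

with open(r'C:\Users\agns1\Downloads\data_work_final.csv', 'rb') as rawdata:
    # Guess the encoding from the first 10,000 bytes of the file
    result = chardet.detect(rawdata.read(10000))
print(result)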
You'll get something like:
{'encoding': <'the actual encoding'>, 'confidence': xxx, 'language': xxxx}
Then do:
data = pd.read_csv(r'C:\Users\agns1\Downloads\data_work_final.csv', encoding='<the actual encoding>')
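If chardet is not installed, or its guess comes back with low confidence, another option is to try a few likely encodings in order and keep the first one that decodes without error. Below is a minimal sketch using only the standard library. The function name `first_working_encoding` and the candidate list are my own assumptions for illustration, not part of pandas or chardet:

```python
def first_working_encoding(raw, candidates=("utf-8", "utf-8-sig", "cp1252", "latin-1")):
    """Return the first candidate encoding that decodes `raw` without error.

    `raw` is a bytes object, e.g. the result of open(path, 'rb').read().
    Returns None if every candidate fails.
    """
    for name in candidates:
        try:
            raw.decode(name)
        except UnicodeDecodeError:
            continue  # this encoding cannot represent the bytes, try the next one
        return name
    return None


# Hypothetical sample: the byte 0xef followed by a plain ASCII letter is not valid UTF-8
sample = b"na\xefve,1\n"
print(first_working_encoding(sample))  # cp1252
```

The returned name can then be passed as the `encoding` argument of `pd.read_csv`. Keep in mind that `latin-1` accepts every possible byte, so it never fails, but it may display the wrong characters if the file was really saved in some other encoding.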