pd.read_excel ValueError: File is not a recognized excel file

Time:05-11

I can download the .ashx link into .xls and open it manually in Excel:

import urllib.request
urllib.request.urlretrieve(
    'https://www.imf.org/-/media/Files/Publications/WEO/WEO-Database/2022/WEOApr2022all.ashx',
    'weo.xls'
)

But when I try reading it with pandas:

import pandas as pd
pd.read_excel('weo.xls')

it raises an error:

Traceback (most recent call last):

  File "", line 1, in <module>
    pd.read_excel('weo.xls')

  File "C:\Anaconda3\lib\site-packages\pandas\util\_decorators.py", line 299, in wrapper
    return func(*args, **kwargs)

  File "C:\Anaconda3\lib\site-packages\pandas\io\excel\_base.py", line 336, in read_excel
    io = ExcelFile(io, storage_options=storage_options, engine=engine)

  File "C:\Anaconda3\lib\site-packages\pandas\io\excel\_base.py", line 1071, in __init__
    ext = inspect_excel_format(

  File "C:\Anaconda3\lib\site-packages\pandas\io\excel\_base.py", line 965, in inspect_excel_format
    raise ValueError("File is not a recognized excel file")

ValueError: File is not a recognized excel file

CodePudding user response:

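The download is not a real Excel workbook. It is a tab-separated text file (effectively a CSV with tab delimiters), and its layout is not exactly a straightforward DataFrame format, so read it with read_csv instead: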

pd.read_csv('weo.xls', sep='\t', encoding='utf_16_le')
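
If you want to check this kind of thing yourself before choosing a reader, here is a minimal sketch that looks at the first bytes of a file. The helper names sniff_format and read_tab_separated are hypothetical (not part of pandas), and it assumes the text is encoded as utf_16_le, as in the read_csv call above. Real .xls files start with the OLE2 signature and real .xlsx files start with the ZIP signature; anything else is treated as plain text.

```python
import csv
import io

# Leading bytes ("magic numbers") of the two real Excel containers.
OLE2_MAGIC = bytes.fromhex("D0CF11E0A1B11AE1")  # legacy .xls (OLE2 compound file)
ZIP_MAGIC = b"PK\x03\x04"                       # .xlsx / .xlsm (ZIP archive)


def sniff_format(head):
    """Classify a file from its first few bytes (hypothetical helper)."""
    if head.startswith(OLE2_MAGIC):
        return "xls"
    if head.startswith(ZIP_MAGIC):
        return "xlsx"
    return "text"


def read_tab_separated(raw, encoding="utf_16_le"):
    """Decode raw bytes and split them into rows of tab-separated fields.

    The default encoding is an assumption taken from the answer above.
    """
    text = raw.decode(encoding)
    return list(csv.reader(io.StringIO(text), delimiter="\t"))
```

With the downloaded file you could call sniff_format(open('weo.xls', 'rb').read(8)). If it returns "text", that would be consistent with read_excel rejecting the file, and read_csv with sep='\t' is the reader to try.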