I'm having an error reading a pandas file of type excel.When I put it to pandas to read an excel file it is showing me this error in the dates, I checked the file is in iso format the dates. What should I do?
import pandas as pd
df = pd.read_excel('./Inputs/fileXXX.xlsx')
c:\Users\XXXXXXXXXXX\Anaconda3\lib\site-packages\openpyxl\worksheet\_reader.py in parse_cell(self, element)
204 data_type = 'd'
205 try:
--> 206 value = from_excel(
207 value, self.epoch, timedelta=style_id in self.timedelta_formats
208 )
c:\Users\XXXXXXXXXXX\Anaconda3\lib\site-packages\openpyxl\utils\datetime.py in from_excel(value, epoch, timedelta)
120 if 0 < value < 60 and epoch == WINDOWS_EPOCH:
121 day = 1
--> 122 return epoch datetime.timedelta(days=day) diff
123
124
OverflowError: date value out of range
CodePudding user response:
This similar question looks helpful. One option is to just read in every column as strings with something like this:
df = pd.read_excel('./Inputs/fileXXX.xlsx', dtype='string')
Alternatively, you can choose to read only the column you know is causing the error in as a string, leaving the others be.
df = pd.read_excel('./Inputs/fileXXX.xlsx', converters={'date_column':str})
CodePudding user response:
For the solution of this problem, we must find the errors in our Base. So I went directly to the files in the openpyxl
library and filtered that path:
openpyxl/worksheet/_reader.py
Look for this part of the code:
except ValueError:
msg = """Cell {0} is marked as a date but the serial value {1} is outside the limits for dates. The cell will be treated as an error.""".format(coordinate, value)
Swap for this:
except(OverflowError, ValueError):
msg = f"""Cell {coordinate} is marked as a date but the serial value {value} is outside the limits for dates. The cell will be treated as an error."""
After that run your Python code, to find the error in your base. That's how you can handle it.
Font:Solution Font