Home > front end >  How to exclude future dates from excel data file using pandas?
How to exclude future dates from excel data file using pandas?

Time:05-15

I'm trying to limit my dataset to dates before today. Below creates a graph but the mask doesn't have any impact. Any help appreciated.

df = pd.read_excel("./data/Weight.xlsx", sheet_name='Data')
    
date = df.iloc[:, 0]
kilos = df.iloc[:, 3]

date = pd.to_datetime(date, format='%Y-%m-%d')
 
mask = (date < pd.to_datetime('today'))
df = df.loc[mask]
df.head()

   2021-02-06 00:00:00    18    2  115.208
42          2022-12-11  14.0  4.0   90.716
43          2022-12-18  14.0  0.0   88.900
44          2022-12-25  14.0  0.0   88.900
45          2023-01-01  14.0  2.0   89.808
46          2023-01-08  14.0  1.0   89.354

CodePudding user response:

please check if you have a header in your excel file. The df may not have a header. Also, assuming the numbers - 41, 42, ... are all indeces. This appears to work fine with making header=None while reading the file.

Input excel is here Input

The updated code is here.

df = pd.read_excel("input.xlsx", sheet_name='Data', header=None)
    
date = df.iloc[:, 0]
kilos = df.iloc[:, 3]

date = pd.to_datetime(date, format='%Y-%m-%d')
 
mask = (date < pd.to_datetime('today'))
df = df.loc[mask]

Output looks fine. Here is what I am getting...

>> df
    0   1   2   3
0   2021-02-06  18  2   115.208

>> mask
0     True
1    False
2    False
3    False
4    False
5    False
Name: 0, dtype: bool

CodePudding user response:

Thanks, that fixed it for me.

Chris

  • Related