remove the 29th of February from a dataframe with date index

Time:09-23

I have this dataframe:

1/1/1990,1.9
1/2/1990,1.9
1/29/1990,1.9
1/4/1990,1.7775
1/5/1990,1.76
1/6/1990,1.76
1/7/1990,1.76
1/8/1990,1.76
1/1/1991,1.9
1/2/1991,1.9
1/29/1991,1.9
1/4/1991,1.7775
2/5/1991,1.76
2/6/1991,1.76
1/7/1991,1.76
3/29/1991,1.76
4/30/1991,1.76

It is a small sample standing in for a larger dataset.

I would like to drop all the data referring to the 29th of February.

This is how I read the dataframe:

dfr = pd.read_csv('test.csv', sep=',', index_col=0, parse_dates=True)

This is the best solution that I have found so far:

dfr = dfr.loc[~(dfr.index.month==2 & dfr.index.day==29)]

However, I get the following error:

TypeError: unsupported operand type(s) for &: 'int' and 'Int64Index'

It is strange, because dfr.index.month==2 as well as dfr.index.day==29 work. I have the feeling that they have to be converted to pandas date but I do not know how.

CodePudding user response:

Your parentheses are incorrect as & has higher precedence than ==.

Your expression is equivalent to ~(dfr.index.month == (2 & dfr.index.day) == 29), which triggers the error unsupported operand type(s) for &: 'int' and 'Int64Index'.
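
To see the precedence issue in isolation, here is a minimal sketch using plain integers instead of the index (the variable names are only illustrative). With plain ints the unparenthesized form does not raise, it just silently gives the wrong answer:

```python
# Illustrative values standing in for one row's month and day.
month, day = 2, 29

# Without parentheses, & is evaluated first, so this is parsed as the
# chained comparison: month == (2 & day) == 29
unparenthesized = month == 2 & day == 29

# With parentheses, each comparison is evaluated before the & is applied.
parenthesized = (month == 2) & (day == 29)

print(unparenthesized, parenthesized)
```

With the pandas index, the same parse leads to the 2 & dfr.index.day operation named in the TypeError.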

You need to use:

dfr = dfr.loc[~((dfr.index.month==2) & (dfr.index.day==29))]
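
The sample in the question does not actually contain a 29th of February, so here is a small self-contained sketch to check the fix. The dates and the column name "value" are made up for illustration; only the mask expression comes from the answer above:

```python
import pandas as pd

# Hypothetical data spanning a leap day (1996 is a leap year).
idx = pd.date_range("1996-02-27", periods=4, freq="D")
dfr = pd.DataFrame({"value": [1.9, 1.76, 1.7, 1.5]}, index=idx)

# Build the mask with each comparison in its own parentheses.
is_leap_day = (dfr.index.month == 2) & (dfr.index.day == 29)

# Keep every row that is not the 29th of February.
dfr_no_leap = dfr.loc[~is_leap_day]
print(dfr_no_leap)
```

Naming the mask first is just a readability choice; it behaves the same as the one-liner.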

CodePudding user response:

You can also use strftime, which avoids the parentheses issue altogether:

dfr[dfr.index.strftime('%m-%d') != '02-29']
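
Note that the line above only returns the filtered frame, so you need to assign the result to keep it. A minimal sketch on made-up data (the dates and the column name "value" are assumptions for illustration):

```python
import pandas as pd

# Hypothetical data spanning a leap day (1996 is a leap year).
idx = pd.date_range("1996-02-27", periods=4, freq="D")
dfr = pd.DataFrame({"value": [1.9, 1.76, 1.7, 1.5]}, index=idx)

# Format each index entry as "MM-DD" and compare against the leap day.
month_day = dfr.index.strftime("%m-%d")
dfr_no_leap = dfr[month_day != "02-29"]
print(dfr_no_leap)
```

This formats every date as a string first, so on a very large index the integer comparison on month and day is likely to be cheaper.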