I have this dataframe:
1/1/1990,1.9
1/2/1990,1.9
1/29/1990,1.9
1/4/1990,1.7775
1/5/1990,1.76
1/6/1990,1.76
1/7/1990,1.76
1/8/1990,1.76
1/1/1991,1.9
1/2/1991,1.9
1/29/1991,1.9
1/4/1991,1.7775
2/5/1991,1.76
2/6/1991,1.76
1/7/1991,1.76
3/29/1991,1.76
4/30/1991,1.76
It is a small sample of a bigger database.
I would like to drop all the data referring to the 29th of February.
This is how I read the dataframe:
dfr = pd.read_csv('test.csv', sep=',', index_col=0, parse_dates=True)
This is the best solution that I have found so far:
dfr = dfr.loc[~(dfr.index.month==2 & dfr.index.day==29)]
However, I get the following error:
TypeError: unsupported operand type(s) for &: 'int' and 'Int64Index'
It is strange, because dfr.index.month==2 as well as dfr.index.day==29 work. I have the feeling that they have to be converted to pandas date but I do not know how.
CodePudding user response:
You are missing parentheses: & has higher precedence than ==.
Your expression is equivalent to ~(dfr.index.month == (2 & dfr.index.day) == 29), which triggers the error unsupported operand type(s) for &: 'int' and 'Int64Index'.
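If you want to check the precedence claim yourself, here is a minimal sketch that uses only the standard library ast module to show how Python parses the unparenthesised condition. The names month and day are placeholders standing in for dfr.index.month and dfr.index.day.

```python
import ast

# Parse the unparenthesised condition (month/day are placeholder names).
node = ast.parse("month == 2 & day == 29", mode="eval").body

# The whole expression is ONE chained comparison with two == operators...
is_chained = isinstance(node, ast.Compare) and len(node.ops) == 2

# ...and its middle operand is the bitwise AND `2 & day`.
middle = node.comparators[0]
middle_is_bitand = isinstance(middle, ast.BinOp) and isinstance(middle.op, ast.BitAnd)

print(is_chained, middle_is_bitand)
```

Both flags being true means Python evaluates 2 & day first, which is where the TypeError comes from.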
You need to use:
dfr = dfr.loc[~((dfr.index.month==2) & (dfr.index.day==29))]
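As a quick check, here is a small self-contained sketch of the same filter. The dates and values are made up for illustration (the sample in the question contains no 29th of February), and the index is built in memory instead of being read from test.csv.

```python
import pandas as pd

# Hypothetical data: 1992 is a leap year, so 2/29/1992 is a valid date.
idx = pd.to_datetime(["1/29/1992", "2/28/1992", "2/29/1992", "3/1/1992"])
dfr = pd.DataFrame({"value": [1.9, 1.9, 1.76, 1.76]}, index=idx)

# Each comparison is wrapped in its own parentheses before combining with &.
is_feb_29 = (dfr.index.month == 2) & (dfr.index.day == 29)

# Keep every row that is NOT the 29th of February.
dfr = dfr.loc[~is_feb_29]
print(dfr)
```

Naming the mask first (is_feb_29 here) is optional, but it keeps the parentheses easier to read.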
CodePudding user response:
You may also use strftime for a solution that avoids the parentheses issue entirely:
dfr[dfr.index.strftime('%m-%d') != '02-29']
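Note that the line above returns a filtered copy, so assign it back if you want to keep the result. Here is a minimal sketch on made-up data (the dates and values are illustrative, not from the question's file):

```python
import pandas as pd

# Hypothetical data including one 29th of February (1992 is a leap year).
idx = pd.to_datetime(["1/29/1992", "2/28/1992", "2/29/1992", "3/1/1992"])
dfr = pd.DataFrame({"value": [1.9, 1.9, 1.76, 1.76]}, index=idx)

# Format every index entry as 'MM-DD' and keep those that are not '02-29'.
dfr = dfr[dfr.index.strftime('%m-%d') != '02-29']
print(dfr)
```

The string comparison is a single operation, so there is no operator precedence to worry about, at the cost of formatting every date as text.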