I am trying to drop the rows that fall on specific minutes (05, 10, 20). The DataFrame has a datetime index.
df5['Year'] = df5.index.year
df5['Month'] = df5.index.month
df5['Day']= df5.index.day
df5['Day_of_Week']= df5.index.day_name()
df5['hour']= df5.index.strftime('%H')
df5['Min']= df5.index.strftime('%M')
df5
Then I run the following:
def clean(df5):
    for i in range(len(df5)):
        hour = pd.Timestamp(df5.index[i]).hour
        minute = pd.Timestamp(df5.index[i]).minute
        if df5 = df5[(df5.index.minute ==5) | (df5.index.minute == 10)| (df5.index.minute == 20)]
            df.drop(axis=1, index=i, inplace=True)
It returns an invalid syntax error.
CodePudding user response:
Looping is not necessary here, and it is not recommended either.
Use DatetimeIndex.minute with Index.isin, invert the mask with ~, and filter with boolean indexing:
df5 = df5[~df5.index.minute.isin([5, 10, 20])]
To reuse the column df5['Min'], compare against string values, because strftime('%M') produces zero-padded strings:
df5 = df5[~df5['Min'].isin(['05', '10', '20'])]
All together:
def clean(df5):
    return df5[~df5.index.minute.isin([5, 10, 20])]
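A minimal, self-contained sketch of the approach above, assuming a made-up DataFrame with one row every 5 minutes (the sample data, the "value" column, and the `minutes` parameter are illustrative and not from the question):

```python
import pandas as pd

# Hypothetical sample data: 12 rows, one every 5 minutes (minutes 0, 5, ..., 55).
idx = pd.date_range("2022-01-01 00:00", periods=12, freq="5min")
df5 = pd.DataFrame({"value": range(12)}, index=idx)


def clean(df, minutes=(5, 10, 20)):
    """Return df without the rows whose index minute is listed in `minutes`."""
    # isin builds a boolean mask of rows to drop; ~ inverts it to rows to keep.
    return df[~df.index.minute.isin(list(minutes))]


cleaned = clean(df5)
print(cleaned.index.minute.tolist())
```

The function returns a new DataFrame and leaves the original untouched, so the result has to be assigned back (for example `df5 = clean(df5)`).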
CodePudding user response:
You can just do it using boolean indexing, assuming that the index is already parsed as datetime.
df5 = df5[~((df5.index.minute == 5) | (df5.index.minute == 10) | (df5.index.minute == 20))]
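Or the equivalent form with negated comparisons, which must be combined with & rather than | (every minute differs from at least two of the three values, so | would keep all rows):
df5 = df5[(df5.index.minute != 5) & (df5.index.minute != 10) & (df5.index.minute != 20)]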
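A quick check of how the masks relate, again on made-up sample data (the DataFrame and variable names are assumptions for illustration). By De Morgan's law, inverting an OR of equalities equals an AND of inequalities; an OR of inequalities is true for every row and filters nothing:

```python
import pandas as pd

# Hypothetical sample data: one row every 5 minutes for one hour.
idx = pd.date_range("2022-01-01 00:00", periods=12, freq="5min")
df5 = pd.DataFrame({"value": range(12)}, index=idx)
m = df5.index.minute

drop_mask = (m == 5) | (m == 10) | (m == 20)   # rows to remove
keep_and = (m != 5) & (m != 10) & (m != 20)    # rows to keep (correct)
keep_or = (m != 5) | (m != 10) | (m != 20)     # true for every row

same_as_inverted = bool((keep_and == ~drop_mask).all())
or_keeps_everything = bool(keep_or.all())
print(same_as_inverted, or_keeps_everything)
```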
CodePudding user response:
Generally speaking, the right syntax for a logical OR inside an if statement is the lowercase keyword or:
today = 'Saturday'
if today == 'Sunday' or today == 'Saturday':
    print('Today is off. Rest at home')
In your case, inside the loop, you should probably test the minute of the current row with something like this:
if minute == 5 or minute == 10 or minute == 20:
    ......
FINAL NOTE:
You mixed up == and =.
In Python (and many other programming languages), a single equals sign = assigns a value to a variable, whereas a double equals sign == checks whether two expressions have the same value. Writing = inside an if condition is what causes the invalid syntax error.
= is an assignment operator
== is an equality operator
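A small sketch of the difference, using a made-up variable `minute`. The built-in compile() is used here only to show that = inside an if condition is rejected before the code even runs:

```python
minute = 5               # "=" assigns the value 5 to the name minute
is_five = minute == 5    # "==" compares and yields a boolean

# An assignment inside an if condition is not valid Python syntax.
try:
    compile("if minute = 5:\n    pass", "<example>", "exec")
    raised = False
except SyntaxError:
    raised = True

print(is_five, raised)
```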