I have two dataframes, one dataframe has index with date and time, the other dataframe's index only has date. Now I'd like to filter rows of first dataframe if its date is within 2nd dataframe's index.
I can do it but very complicated to use two for loop to check each item. Is there a simple way?
Here is what I did:
First dataframe
import pandas as pd
df=pd.DataFrame({'value':[1,2,3,4,5,6]})
df.index=['2022-01-01 10:00','2022-01-02 13:00','2022-01-07 10:00','2022-01-08 10:00','2022-01-11 10:00','2022-01-12 10:00']
df.index=pd.to_datetime(df.index)
df
value
2022-01-01 10:00:00 1
2022-01-02 13:00:00 2
2022-01-07 10:00:00 3
2022-01-08 10:00:00 4
2022-01-11 10:00:00 5
2022-01-12 10:00:00 6
2nd dataframe
f_1=pd.DataFrame({'value':[1,2,3]})
df_1.index=['2022-01-02','2022-01-05','2022-01-07']
df_1.index=pd.to_datetime(df_1.index)
df_1
value
2022-01-02 1
2022-01-05 2
2022-01-07 3
Now I am checking each element of first dataframe's index is equal to any element of 2nd dataframe's index. If so, then save value of True
date_same=[False]*len(df.index)
for ix,date1 in enumerate(df.index.date):
for date2 in df_1.index:
if date1==date2:
date_same[ix]=True
break
date_same
[False, True, True, False, False, False]
Now using saved list to filter the first dataframe
df_filter=df.loc[date_same]
df_filter
value
2022-01-02 13:00:00 2
2022-01-07 10:00:00 3
CodePudding user response:
You can directly use np.isin
to create a boolean array and index it:
df.loc[np.isin(df.index.date, df_1.index.date)]