I am trying to filter a dataframe through this method:
filter = (df_1['Year']==2022) & **(df_1['Quarter'].isin(['2022Q3','2022Q4']))**
df_incumbent = df_1[filter].groupby(['Lane','Carrier Scac','Quarter']).agg({'Cust Linehaul':np.mean,'Load Number':'count'}).unstack('Quarter',fill_value='NaN')
df_incumbent
And it returns nothing. However when I do a direct comparison item by item it returns a dataframe.
filter = (df_1['Year']==2022) & (df_1['Quarter']=='2022Q3')
Some more information on the df_1['Quarter']. If it helps with some clue.
df_1['Quarter'].unique()
<PeriodArray>
['2021Q4', '2021Q1', '2021Q3', '2021Q2', '2022Q1', '2022Q2', '1970Q1',
'NaT', '2022Q3', '2022Q4']
Length: 10, dtype: period[Q-DEC]
I have used isin
successfully by passing a list of items, but not sure why this one is not working. I tried to go through other questions, but not able to still solve it.
CodePudding user response:
Looks like isin
only works on list with the same dtype, try create PeriodIndex first:
df_1['Quarter'].isin(pd.PeriodIndex(['2022Q3','2022Q4'], freq='Q'))
Example:
s = pd.to_datetime(['2022-10-01', '2023-10-02']).to_period('Q-DEC')
s
# PeriodIndex(['2022Q4', '2023Q4'], dtype='period[Q-DEC]')
s.isin(['2022Q4'])
# array([False, False])
s.isin(pd.PeriodIndex(['2022Q4'], freq='Q'))
# array([ True, False])