Home > Software design >  Series finds value through direct == comparison but fails on isin() process
Series finds value through direct == comparison but fails on isin() process

Time:01-01

I am trying to filter a dataframe through this method:

filter = (df_1['Year']==2022) & **(df_1['Quarter'].isin(['2022Q3','2022Q4']))**
df_incumbent = df_1[filter].groupby(['Lane','Carrier Scac','Quarter']).agg({'Cust Linehaul':np.mean,'Load Number':'count'}).unstack('Quarter',fill_value='NaN')
df_incumbent

And it returns nothing. However when I do a direct comparison item by item it returns a dataframe.

filter = (df_1['Year']==2022) & (df_1['Quarter']=='2022Q3')

Some more information on the df_1['Quarter']. If it helps with some clue.

df_1['Quarter'].unique()

<PeriodArray>
['2021Q4', '2021Q1', '2021Q3', '2021Q2', '2022Q1', '2022Q2', '1970Q1',
    'NaT', '2022Q3', '2022Q4']
Length: 10, dtype: period[Q-DEC]

I have used isin successfully by passing a list of items, but not sure why this one is not working. I tried to go through other questions, but not able to still solve it.

CodePudding user response:

Looks like isin only works on list with the same dtype, try create PeriodIndex first:

df_1['Quarter'].isin(pd.PeriodIndex(['2022Q3','2022Q4'], freq='Q'))

Example:

s = pd.to_datetime(['2022-10-01', '2023-10-02']).to_period('Q-DEC')
s
# PeriodIndex(['2022Q4', '2023Q4'], dtype='period[Q-DEC]')

s.isin(['2022Q4'])
# array([False, False])

s.isin(pd.PeriodIndex(['2022Q4'], freq='Q'))
# array([ True, False])
  • Related