Home > database >  Pandas resample based on string like PeriodIndex
Pandas resample based on string like PeriodIndex

Time:11-07

I have a dataframe like as below

df = pd.DataFrame({'subject_id':[1,1,1,1,1,2,2,2,2,2],
                   'qtr_info' :['2017Q1','2017Q3','2017Q4','2018Q1','2018Q4','2017Q1','2017Q4','2018Q2','2018Q4','2019Q1'],
                   'val' :[5,5,5,5,1,6,5,5,8,3],
                   'Prod_id':['A','B','C','A','E','Q','G','F','G','H']})

I would like to do the below

a) Fill all the missing quarters of a subject

b) fillna for other columns using the mean value for respective columns (for the same subject). Don't refer other subject ids for computing mean value

So, I tried the below

df_time.resample('Q').mean()   #didn't work
df_time.groupby('subject_id').resample('Q).mean()  #didn't work

But I got the below error

TypeError: Only valid with DatetimeIndex, TimedeltaIndex or PeriodIndex, but got an instance of 'Int64Index'

I expect my output to be like as below for subject_id = 1. Here yellow color shows newly inserted rows (to fill the missing gaps)

enter image description here

CodePudding user response:

Create PeriodIndex first and for resample use Resampler.agg with dictionary, for replace missing values by means use Series.fillna with GroupBy.transform:

df_time['qtr_info'] = pd.PeriodIndex(df_time['qtr_info'], freq='Q')
df = (df_time.set_index('qtr_info')
             .groupby('subject_id')
             .resample('Q')
             .agg({'val':'mean', 'Prod_id':'ffill'}))

df['val'] = df['val'].fillna(df.groupby(level=0)['val'].transform('mean'))
print (df)
                     val Prod_id
subject_id qtr_info             
1          2017Q1    5.0       A
           2017Q2    4.2       A
           2017Q3    5.0       B
           2017Q4    5.0       C
           2018Q1    5.0       A
           2018Q2    4.2       A
           2018Q3    4.2       A
           2018Q4    1.0       E
2          2017Q1    6.0       Q
           2017Q2    5.4       Q
           2017Q3    5.4       Q
           2017Q4    5.0       G
           2018Q1    5.4       G
           2018Q2    5.0       F
           2018Q3    5.4       F
           2018Q4    8.0       G
           2019Q1    3.0       H
  • Related