Home > database >  missing month count from the datetime column in pandas dataframe
missing month count from the datetime column in pandas dataframe

Time:10-20

Existing Dataframe :

      Id         Date_of_activity   
    
       A        2020-09-17 12:36:00     
       A        2020-11-02 00:00:00     
       A        2020-12-02 00:00:00     
       A        2021-01-02 00:00:00     
       A        2021-02-02 00:00:00 
       A        2021-03-03 12:12:00     
       A        2021-04-03 12:12:00     
    
       B        2020-11-02 00:00:00         
       B        2021-01-02 00:00:00         
       B        2021-03-03 12:12:00     
       B        2021-04-03 12:12:00

Expected Dataframe :

  Id      Missed_Month_Count
  A              1
  B              2

I am looking to calculate the Number of Missed Months where NO activity was Done. For Id A , No activity was done in 10th Month of 2020 so the missed month count should be 1 , likewise for B , No activity was done in 12th month of 2020 and 2nd month of 2021 , which makes missed_month_count as 2.

CodePudding user response:

You can use:

# convert to Monthly period
s = pd.to_datetime(df['Date_of_activity']).dt.to_period('M')

# compute the difference per group
# if != 1, then there is a missing month
out = (s.sort_values()
        .groupby(df['Id'], sort=False)
        .apply(lambda g: g.drop_duplicates().diff().ne('M').sum()-1)
        .reset_index(name='Missed_Month_Count')
      )

output:

  Id  Missed_Month_Count
0  A                   1
1  B                   2
  • Related