Home > Blockchain >  How to find the number of days in each month between two date in different years
How to find the number of days in each month between two date in different years

Time:01-26

I'm trying to get the number of day between two days but per each month. I found some answers but I can't figure out how to do it when the dates have two different years.

For example, I have this dataframe:

df = {'Id': ['1','2','3','4','5'],
      'Item': ['A','B','C','D','E'],
        'StartDate': ['2019-12-10', '2019-12-01', '2019-01-01', '2019-05-10', '2019-03-10'],
        'EndDate': ['2020-01-30' ,'2020-02-02','2020-03-03','2020-03-03','2020-02-02']
        }
df = pd.DataFrame(df,columns= ['Id', 'Item','StartDate','EndDate'])

And I want to get this dataframe:

Expected O/P:

CodePudding user response:

s = (df[["StartDate", "EndDate"]]
       .apply(lambda row: pd.date_range(row.StartDate, row.EndDate), axis=1)
       .explode())

new = (s.groupby([s.index, s.dt.year, s.dt.month])
        .count()
        .unstack(level=[1, 2], fill_value=0))

new.columns = new.columns.map(lambda c: f"{c[0]}-{str(c[1]).zfill(2)}")
new = new.sort_index(axis="columns")
  • get all the dates in between StartDate and EndDate per row, and explode that list of dates to their own rows
  • group by the row id, year and month & count records
  • unstack the year & month identifier to be on the columns side as a multiindex
  • join that year & month values with a hypen in between (also zerofill months, e.g., 03)
  • lastly sort the year-month pairs on columns

to get

>>> new

   2019-11  2019-12  2020-01  2020-02  2020-03
0        0       22       30        0        0
1        0       31       31        2        0
2        0       31       31       29        3
3       21       31       31       29        3
4        9       31       31        2        0
  • Related