Home > Enterprise >  pandas add the months between two dates as a new rows
pandas add the months between two dates as a new rows

Time:04-11

i have a dataframe like this:

import pandas as pd
data=[[1,'2021-10-01 00:00:00','2021-12-01 23:59:59'],[2,'2022-02-01 00:00:00','2022-05-10 23:59:59']]
df=pd.DataFrame(data,columns=['id','start_date','end_date'])
df

id  start_date           end_date
1   2021-10-01 00:00:00  2021-12-01 23:59:59
2   2022-02-01 00:00:00  2022-05-10 23:59:59

I want to get the months between start and end dates in a new column.

expected output:

data2=[[1,'2021-10-01 00:00:00','2021-12-01 23:59:59','2021-10'],
       [1,'2021-10-01 00:00:00','2021-12-01 23:59:59','2021-11'],
       [1,'2021-10-01 00:00:00','2021-12-01 23:59:59','2021-12'],
       [2,'2022-02-01 00:00:00','2022-05-10 23:59:59','2022-02'],
       [2,'2022-02-01 00:00:00','2022-05-10 23:59:59','2022-03'],
       [2,'2022-02-01 00:00:00','2022-05-10 23:59:59','2022-04'],
       [2,'2022-02-01 00:00:00','2022-05-10 23:59:59','2022-05']]
df2=pd.DataFrame(data2,columns=['id','start_date','end_date','months'])
df2

id  start_date            end_date             months
1   2021-10-01 00:00:00   2021-12-01 23:59:59  2021-10
1   2021-10-01 00:00:00   2021-12-01 23:59:59  2021-11
1   2021-10-01 00:00:00   2021-12-01 23:59:59  2021-12
2   2022-02-01 00:00:00   2022-05-10 23:59:59  2022-02
2   2022-02-01 00:00:00   2022-05-10 23:59:59  2022-03
2   2022-02-01 00:00:00   2022-05-10 23:59:59  2022-04
2   2022-02-01 00:00:00   2022-05-10 23:59:59  2022-05

is there a way to do this?

CodePudding user response:

Create months periods by period_range and then use DataFrame.explode:

df['months'] = [pd.period_range(s, e, freq='m') 
                                        for s, e in zip(df['start_date'], df['end_date'])]

#alternative
#lambda x: pd.period_range(x['start_date'], x['end_date'], freq='m')
#df['months'] = df.apply(f, axis=1)

df = df.explode('months').reset_index(drop=True)
print (df)
   id           start_date             end_date   months
0   1  2021-10-01 00:00:00  2021-12-01 23:59:59  2021-10
1   1  2021-10-01 00:00:00  2021-12-01 23:59:59  2021-11
2   1  2021-10-01 00:00:00  2021-12-01 23:59:59  2021-12
3   2  2022-02-01 00:00:00  2022-05-10 23:59:59  2022-02
4   2  2022-02-01 00:00:00  2022-05-10 23:59:59  2022-03
5   2  2022-02-01 00:00:00  2022-05-10 23:59:59  2022-04
6   2  2022-02-01 00:00:00  2022-05-10 23:59:59  2022-05
  • Related