Replace "NaT" with next date based on previous date

My DataFrame looks like this:

column1      column2
2020-11-01   1
2020-12-01   2
2021-01-01   3
NaT          4
NaT          5
NaT          6

Output should be like this:

column1      column2
2020-11-01   1
2020-12-01   2
2021-01-01   3
2021-02-01   4
2021-03-01   5
2021-04-01   6

I can't work out how to generate the next dates (only the month and year change) from the last existing date in the DataFrame. Is there a pythonic way to do this? Thanks for any help!

Regards Tomasz

CodePudding user response：

This is how I would do it. You could probably tidy it up into a one-liner, but spelling out the steps illustrates the process better.

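import pandas as pd

# convert to datetime (the strings are year-month-day)
df['column1'] = pd.to_datetime(df['column1'], format='%Y-%m-%d')

# carry the last known date forward; each run of NaT shares one anchor date
df['temp'] = df['column1'].ffill()

# number the rows within each anchor group (0 for the row holding the real date)
df['temp2'] = df.groupby('temp').cumcount()

# add that many months to the anchor date
df['column1'] = [x + pd.DateOffset(months=y) for x, y in zip(df['temp'], df['temp2'])]

# drop the helper columns
df = df.drop(columns=['temp', 'temp2'])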
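The steps above could also be wrapped in a small helper and run against the sample data from the question. This is only a sketch: the function name fill_monthly_gaps is made up for illustration, and it assumes the first value in the column is a real date, so that every NaT has a previous date to build on.

```python
import pandas as pd


def fill_monthly_gaps(dates: pd.Series) -> pd.Series:
    """Replace each NaT with the previous known date plus 1, 2, ... months.

    Hypothetical helper; assumes the series does not start with NaT.
    """
    anchor = dates.ffill()                    # last known date for every row
    step = dates.groupby(anchor).cumcount()   # 0, 1, 2, ... within each anchor
    filled = [a + pd.DateOffset(months=int(s)) for a, s in zip(anchor, step)]
    return pd.Series(filled, index=dates.index, name=dates.name)


# Sample frame from the question; None becomes NaT after conversion.
df = pd.DataFrame({
    "column1": pd.to_datetime(
        ["2020-11-01", "2020-12-01", "2021-01-01", None, None, None]
    ),
    "column2": [1, 2, 3, 4, 5, 6],
})

df["column1"] = fill_monthly_gaps(df["column1"])
print(df)
```

Keeping the anchor and step as local variables avoids adding temporary columns to the DataFrame.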

CodePudding user response：

pandas has built-in support for time series, so a monthly sequence can be generated directly with pd.date_range:

pd.date_range("2020-11-1", freq=pd.tseries.offsets.DateOffset(months=1), periods=10)

will give

DatetimeIndex(['2020-11-01', '2020-12-01', '2021-01-01', '2021-02-01',
               '2021-03-01', '2021-04-01', '2021-05-01', '2021-06-01',
               '2021-07-01', '2021-08-01'],
              dtype='datetime64[ns]', freq='<DateOffset: months=1>')
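To apply this to the DataFrame from the question, one option is to start the range at the last known date and write the generated dates into the NaT rows. This is a rough sketch, and it assumes the missing dates form a single block at the end of the column; the variable names missing, last_known and fill are only illustrative.

```python
import pandas as pd

# Sample frame from the question; None becomes NaT.
df = pd.DataFrame({
    "column1": pd.to_datetime(
        ["2020-11-01", "2020-12-01", "2021-01-01", None, None, None]
    ),
    "column2": [1, 2, 3, 4, 5, 6],
})

# Assumption: all NaT rows sit in one trailing block after the last real date.
missing = df["column1"].isna()
last_known = df["column1"].dropna().iloc[-1]

# The range starts at last_known itself, so ask for one extra period
# and drop the first element with [1:].
fill = pd.date_range(
    last_known,
    periods=int(missing.sum()) + 1,
    freq=pd.DateOffset(months=1),
)[1:]

df.loc[missing, "column1"] = fill
print(df)
```

If the column has several separate gaps, this single range would not be enough, and a per-gap approach would be needed instead.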