My DataFrame looks like this:
column1 column2
2020-11-01 1
2020-12-01 2
2021-01-01 3
NaT 4
NaT 5
NaT 6
Output should be like this:
column1 column2
2020-11-01 1
2020-12-01 2
2021-01-01 3
2021-02-01 4
2021-03-01 5
2021-04-01 6
I can't work out how to generate the next dates (only the month and year change) from the last existing date in the df. Is there a Pythonic way to do this? Thanks for any help!
Regards Tomasz
CodePudding user response:
This is how I would do it. You could probably tidy it up into a one-liner, but spelling out the steps illustrates the process better.
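import pandas as pd
# convert to datetime (the dates are year-month-day)
df['column1'] = pd.to_datetime(df['column1'], format='%Y-%m-%d')
# forward-fill so every missing row carries the last known date
df['temp'] = df['column1'].ffill()
# count each row's position within its forward-filled group (0 for the known date itself)
df['temp2'] = df.groupby('temp').cumcount()
# add that many months to the last known date
df['column1'] = [x + pd.DateOffset(months=y) for x, y in zip(df['temp'], df['temp2'])]
# drop the helper columns
df = df.drop(columns=['temp', 'temp2'])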
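For anyone who wants to run this end to end, here is a minimal self-contained sketch of the same forward-fill-and-count idea. The sample frame is rebuilt from the question, and `fill_next_months` is a hypothetical helper name, not something from pandas. One deliberate difference from the steps above: it groups by a running count of known dates rather than by the date value, on the assumption that this is safer if the same date ever appears twice.

```python
import pandas as pd

# Sample data mirroring the question; None becomes NaT after conversion.
df = pd.DataFrame({
    "column1": pd.to_datetime(
        ["2020-11-01", "2020-12-01", "2021-01-01", None, None, None]
    ),
    "column2": [1, 2, 3, 4, 5, 6],
})


def fill_next_months(dates: pd.Series) -> pd.Series:
    """Replace each NaT with the last known date plus one month per missing row."""
    # Carry the last known date down into the missing rows.
    last_known = dates.ffill()
    # Each known date starts a new run; the NaT rows after it share its run id.
    run_id = dates.notna().cumsum()
    # Position inside the run: 0 for the known date, then 1, 2, ... for the NaT rows.
    steps = dates.groupby(run_id).cumcount()
    filled = [d + pd.DateOffset(months=int(n)) for d, n in zip(last_known, steps)]
    return pd.Series(filled, index=dates.index)


df["column1"] = fill_next_months(df["column1"])
print(df)
```

Keeping the logic in a function avoids leaving temporary columns on the frame.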
CodePudding user response:
pandas has built-in support for time series data, so you can generate the monthly dates directly with date_range:
pd.date_range("2020-11-1", freq=pd.tseries.offsets.DateOffset(months=1), periods=10)
which gives:
DatetimeIndex(['2020-11-01', '2020-12-01', '2021-01-01', '2021-02-01',
'2021-03-01', '2021-04-01', '2021-05-01', '2021-06-01',
'2021-07-01', '2021-08-01'],
dtype='datetime64[ns]', freq='<DateOffset: months=1>')
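To connect this to the DataFrame in the question, one possible sketch is to start the range at the last known date and write the generated dates into the missing rows. This assumes all the NaT values sit in a single block at the end of the column, as they do in the question; the variable names `missing`, `last_known` and `new_dates` are just for illustration.

```python
import pandas as pd

# Sample data mirroring the question; None becomes NaT after conversion.
df = pd.DataFrame({
    "column1": pd.to_datetime(
        ["2020-11-01", "2020-12-01", "2021-01-01", None, None, None]
    ),
    "column2": [1, 2, 3, 4, 5, 6],
})

# Assumption: every missing date is in one trailing block.
missing = df["column1"].isna()
last_known = df.loc[~missing, "column1"].iloc[-1]

# Generate the last known date plus one entry per missing row,
# then drop the first entry because it is already in the frame.
new_dates = pd.date_range(
    last_known,
    periods=int(missing.sum()) + 1,
    freq=pd.DateOffset(months=1),
)[1:]

# Write the generated dates into the missing rows, in order.
df.loc[missing, "column1"] = new_dates.to_numpy()
print(df)
```

If the gaps can appear in the middle of the column, this version does not apply as written and the forward-fill approach from the other answer is the better fit.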