Home > other >  Pandas dataframe - fillna with last of next month
Pandas dataframe - fillna with last of next month

Time:06-04

I've been staring at this way too long and I think Ive lost my mind, it really shouldn't be as complicated as I'm making it.

I have a df:

Date1 Date2
2022-04-01 2022-06-17
2022-04-15 2022-04-15
2022-03-03 NaT
2022-04-22 NaT
2022-05-06 2022-06-06

I want to fill the blanks in 'Date2' where it keeps the values from 'Date2' if they are present but if 'Date2' is NaT then I want it to be the last date of the subsequent month from 'Date1'.

In the example above, the 2 NaT fields would become:

Date1 Date2
2022-03-03 2022-04-30
2022-04-22 2022-05-31

I know I have to use .fillna and the closest I've come is this:

df['Date2'] = (df['Date2'].fillna((df['Date1']   pd.DateOffset(months=1)).replace)).to_numpy().astype('datetime64[M]')

This returns the first of the month. However, it returns the first of the month for all rows (not just NaT rows) and it is returning the first of the month as opposed to the last of the month.

I'm pretty sure my parenthesis are messed up and I've tried many different combinations of - timedelta and similar.

What am I doing wrong here? TIA!

CodePudding user response:

Your question can be interpreted in two ways given the provided example.

End of month of the next row's date 1 (which now does not seem to be what you want)

You need to use pd.offses.MonthEnd and shift

df['Date2'] = (df['Date2']
               .fillna(df['Date1'].add(pd.offsets.MonthEnd())
                                  .shift(-1))
               )

Next month's end (same row)

If you want the next month end of the same row:

df['Date2'] = (df['Date2']
               .fillna(df['Date1'].add(pd.offsets.MonthEnd(2)))
               )

Output:

       Date1      Date2
0 2022-04-01 2022-06-17
1 2022-04-15 2022-04-15
2 2022-03-03 2022-04-30
3 2022-04-22 2022-05-31
4 2022-05-06 2022-06-06

CodePudding user response:

Use MonthEnd and loc:

from pandas.tseries.offsets import MonthEnd
>>> df.loc[df['Date2'].isnull(), 'Date2'] = df['Date1']   pd.DateOffset(months=1)   MonthEnd(1)

CodePudding user response:

Use MonthEnd with an offset of 2 (current month and next month):

df['Date2'] = df['Date2'].fillna(df['Date1'].add(pd.offsets.MonthEnd(2)))
print(df)

# Output
       Date1      Date2
0 2022-04-01 2022-06-17
1 2022-04-15 2022-04-15
2 2022-03-03 2022-04-30
3 2022-04-22 2022-05-31
4 2022-05-06 2022-06-06
  • Related