Home > Net >  Group date column into n-day periods
Group date column into n-day periods

Time:02-25

I need a function that groups date column into n-day periods with respect to some start and end dates (1 year interval). To assign a quarter (~90 day period) to every date in the data frame I used the code below, which is not very neat (and I want to reuse it for 30-day period as well)

def get_quarter(row, start_date, col_name):
#     date = row['TRN_DT']
    date = row[col_name]
    if date >= start_date and date <= start_date   timedelta(days = 90):
        return 0
    if date > start_date   timedelta(days = 90) and date <= start_date   timedelta(180):
        return 1
    if date > start_date   timedelta(180) and date <= start_date   timedelta(270):
        return 2
    return 3

It basically checks row by row which interval current date belongs to. I was wondering whether there is a better way to do this. pandas.Series.dt.to_period() will not do since it uses a calendar year as a reference --start 01.Jan, end 31.Dec; that is, 16.Jan.XXXX will always be in Q1; what I want is for 16.Jan to be in Q3 if the start date is 16-Jun. Thanks

CodePudding user response:

FTR, a possible solution is to shift every date in the series according to the start_date, to simulate that start_date is the beginning of the year:

>>> start_date = pd.to_datetime("2021-06-16")
>>> dates_series = pd.Series([pd.to_datetime("2020-01-16"), pd.to_datetime("2020-04-16")], name="dates")
0    1
1    2
Name: dates, dtype: int64

We calculate the difference between the current date and the beginning of the same year.

>>> offset = start_date - start_date.replace(month=1, day=1)
>>> offset
166 days 00:00:00

We move all of our dates by the same offser to calculate the "new quarter"

>>> (dates - offset).dt.quarter
0    3
1    4
Name: dates, dtype: int64
  • Related