Home > OS >  Slice a df into windows of 3Y and 1M with a date range Python
Slice a df into windows of 3Y and 1M with a date range Python

Time:04-30

I have a df with a date index as follow:

ind = pd.date_range(start="2015-12-31", end = "2022-04-26", freq="D")
df = pd.DataFrame(
    {
        "col1": range(len(ind))
    },
    index=ind
)

What I need is slice the df in windows from the end of each month from 2017-08-31 to 3 years plus 1 month, so I have the next chunk of code

n = timedelta(365 * 3)   relativedelta(months=1)
fechas_ = pd.date_range("2017-08-31", ind.max() - n, freq="M")

# create a for loop to check the beginning and the end of each window

for i in fechas_:
    print(f"start: {i}")
    print(f"end: {i   n}")
    print("\n")

My problem is that I need the last day of the month as the end of each window e.g.:

# first window
start: 2017-08-31 00:00:00
end: 2020-09-30 00:00:00

# second window
start: 2017-09-30 00:00:00
end: 2020-10-31 00:00:00

# so on

But I'm getting:

# first window
start: 2017-08-31 00:00:00
end: 2020-09-29 00:00:00

# second window
start: 2017-09-30 00:00:00
end: 2020-10-29 00:00:00

# 3
2017-10-31 00:00:00
2020-11-29 00:00:00

# 4
2017-11-30 00:00:00
2020-12-29 00:00:00

# 5
2017-12-31 00:00:00
2021-01-30 00:00:00

# 6
2018-01-31 00:00:00
2021-02-27 00:00:00

# 7
2018-02-28 00:00:00
2021-03-27 00:00:00

# 8
2018-03-31 00:00:00
2021-04-29 00:00:00

# 9
2018-04-30 00:00:00
2021-05-29 00:00:00

# 10
2018-05-31 00:00:00
2021-06-29 00:00:00

# 11
2018-06-30 00:00:00
2021-07-29 00:00:00

# 12
2018-07-31 00:00:00
2021-08-30 00:00:00

# 13
2018-08-31 00:00:00
2021-09-29 00:00:00

# 14
2018-09-30 00:00:00
2021-10-29 00:00:00

# 15
2018-10-31 00:00:00
2021-11-29 00:00:00

# 16
2018-11-30 00:00:00
2021-12-29 00:00:00

# 17
2018-12-31 00:00:00
2022-01-30 00:00:00

# 18
2019-01-31 00:00:00
2022-02-27 00:00:00

# 19
2019-02-28 00:00:00
2022-03-27 00:00:00

Does someone know how can I solve this?

Thanks a lot

CodePudding user response:

In your code

n = timedelta(365 * 3)   relativedelta(months=1)

try replacing it with

n = relativedelta(years=3, months=1, day=31)
  • Related