Home > Back-end >  How can I create a list of date ranges, starting from today going back every 3 months?
How can I create a list of date ranges, starting from today going back every 3 months?

Time:07-25

Background:

I am using pandas to append to an already-existing table in a SQL database. The issue is that the sql database I am connecting to is larger than what pandas can handle. To get around this, I am going to access the table in chunks. The SQL query will contain a WHERE clause, and I want to iterate through it using a list of date ranges.

This is my example SQL query (simplified):

SELECT * FROM (SELECT col1, col2, col3 ... FROM table)

WHERE DATE(server_time) >= min_date AND DATE(server_time) <= max_date

I want to iterate through a list of date ranges to call this query This is how the server_time field is formatted:

'2022-03-29 20:08:57'

So in order to filter the dates, I need to create this list of date ranges, and I am struggling with this.

This is what I tried:

num_months = 3
today = datetime.datetime.today()
date_list = [today - datetime.timedelta(months=x) for x in range(num_months)]

I tried creating a list of date ranges varying by days instead of months, and it worked. I did this by changing (months=x) to (days=x). When I run the code above, however, I get 'months' is an invalid keyword argument for __new__(). Any ideas? Thanks so much in advance!!

CodePudding user response:

With , you could use:

import numpy as np

N = 5 # number of values
pd.Timestamp('today')-pd.DateOffset(months=1)*np.arange(N)

output:

array([Timestamp('2022-07-25 01:23:45'),
       Timestamp('2022-06-25 01:23:45'),
       Timestamp('2022-05-25 01:23:45'),
       Timestamp('2022-04-25 01:23:45'),
       Timestamp('2022-03-25 01:23:45')], dtype=object)

As Series:

N = 5
pd.Series(pd.Timestamp('2022-07-25 01:23:45')-pd.DateOffset(months=1)*np.arange(5))

output:

0   2022-07-25 01:23:45
1   2022-06-25 01:23:45
2   2022-05-25 01:23:45
3   2022-04-25 01:23:45
4   2022-03-25 01:23:45
dtype: datetime64[ns]
programatically determine the number of periods:

going up to the END date (but not before):

START = pd.Timestamp('today')
END = pd.Timestamp('2022-01-01')
N = (START.to_period('M')-END.to_period('M')).n

pd.Series(START-pd.DateOffset(months=1)*np.arange(N 1))

output:

0   2022-07-25 01:23:45
1   2022-06-25 01:23:45
2   2022-05-25 01:23:45
3   2022-04-25 01:23:45
4   2022-03-25 01:23:45
5   2022-02-25 01:23:45
6   2022-01-25 01:23:45
dtype: datetime64[ns]

CodePudding user response:

During creation of datetime.timedelta there are following arguments and default values thereof

(days=0, seconds=0, microseconds=0, milliseconds=0, minutes=0, hours=0, weeks=0)

As you might observe there is not months. Instances of datetime.datetime have .replace method which you might find useful, consider following example

import datetime
d0 = datetime.datetime(2022,3,28,20,8,57)
d1 = d0.replace(month=2)
d2 = d0.replace(month=1)
print(d0)  # 2022-03-28 20:08:57
print(d1)  # 2022-02-28 20:08:57
print(d2)  # 2022-01-28 20:08:57

Beware that not all months have equal number of days.

  • Related