Iterating through a date range in python that generates a start_date and end_date n days apart from

Time:10-25

Let's say I have a URL that looks like this and returns data in JSON format when given a start and end date in the format "YYYY-MM-DD".

url = "https://thisisaurl.com/data/asd007?start=2021-10-01&end=2021-10-28"

The maximum number of days for which data can be retrieved via this URL is 28. To retrieve a year of data, I therefore need to:

  1. loop over a date range that each time generates a start_date and an end_date with (a max of) 28 days in between
  2. feed this to the url
  3. append / collect this data each time the url is called

Let's say I want to retrieve data for the period 2021-05-01 to 2021-10-24. I know how to loop over this date range with a time delta of 28 days, like so:

from datetime import date, timedelta
start_date = date(2021, 5, 1)
end_date = date(2021, 10, 24)
delta = timedelta(days=28)
while start_date <= end_date:
    print(start_date.strftime("%Y-%m-%d"))
    # Advance to the start of the next 28-day window
    start_date += delta

2021-05-01
2021-05-29
2021-06-26
2021-07-24
2021-08-21
2021-09-18
2021-10-16

But I struggle with assigning these values as the start and end date in the URL, and with making sure that the full period is covered (e.g. ending on 2021-10-24 instead of stopping at 2021-10-16). Any ideas?

CodePudding user response:

IIUC, you are trying to add the start and end dates to the URL in a loop, which you can do with f-strings. For example:

from datetime import date, timedelta
start_date = date(2021, 5, 1)
end_date = date(2021, 10, 24)
delta = timedelta(days=28)
base_url = "https://thisisaurl.com/data/asd007?"
while start_date <= end_date:
    period_start = start_date.strftime('%Y-%m-%d')
    # Note: 27 days after the start, so each window covers 28 days inclusive
    period_end = (start_date + timedelta(days=27)).strftime('%Y-%m-%d')
    print(f'{base_url}start={period_start}&end={period_end}')
    start_date += delta

Output

https://thisisaurl.com/data/asd007?start=2021-05-01&end=2021-05-28
https://thisisaurl.com/data/asd007?start=2021-05-29&end=2021-06-25
https://thisisaurl.com/data/asd007?start=2021-06-26&end=2021-07-23
https://thisisaurl.com/data/asd007?start=2021-07-24&end=2021-08-20
https://thisisaurl.com/data/asd007?start=2021-08-21&end=2021-09-17
https://thisisaurl.com/data/asd007?start=2021-09-18&end=2021-10-15
https://thisisaurl.com/data/asd007?start=2021-10-16&end=2021-11-12
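
Note that the last URL above ends on 2021-11-12, past the requested end date of 2021-10-24. A minimal sketch of one way to cut the final window off at the end date, assuming the 28-day limit counts both endpoints (as in the question's example URL). The helper name date_windows is made up for illustration:

```python
from datetime import date, timedelta


def date_windows(start, end, max_days=28):
    """Yield (window_start, window_end) pairs covering start..end inclusive.

    Each window spans at most max_days calendar days, counting both endpoints.
    date_windows is a hypothetical helper, not part of any library.
    """
    step = timedelta(days=max_days)
    current = start
    while current <= end:
        # The last window is cut off at `end` instead of running past it.
        window_end = min(current + step - timedelta(days=1), end)
        yield current, window_end
        current += step


base_url = "https://thisisaurl.com/data/asd007?"
windows = list(date_windows(date(2021, 5, 1), date(2021, 10, 24)))
for window_start, window_end in windows:
    # date.isoformat() gives the same YYYY-MM-DD text as strftime('%Y-%m-%d')
    print(f"{base_url}start={window_start.isoformat()}&end={window_end.isoformat()}")
```

With the dates from the question, the first six URLs are the same as above and the last one becomes start=2021-10-16&end=2021-10-24.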

CodePudding user response:

I'm not sure I understood the question, but does this answer it?

from datetime import date, timedelta

start_date = date(2021, 5, 1)
end_date = date(2021, 10, 24)
delta = timedelta(days=28)
while start_date < end_date:
    end_period = start_date + delta
    # Clamp the final period so it never runs past end_date
    if end_period > end_date:
        end_period = end_date
    start_str = start_date.strftime('%Y-%m-%d')
    end_str = end_period.strftime('%Y-%m-%d')
    print('Period from', start_str, 'to', end_str)
    print(f'URL: https://thisisaurl.com/data/asd007?start={start_str}&end={end_str}')
    # The next period starts the day after this one ends
    start_date = end_period + timedelta(days=1)

Output

Period from 2021-05-01 to 2021-05-29
URL: https://thisisaurl.com/data/asd007?start=2021-05-01&end=2021-05-29
Period from 2021-05-30 to 2021-06-27
URL: https://thisisaurl.com/data/asd007?start=2021-05-30&end=2021-06-27
Period from 2021-06-28 to 2021-07-26
URL: https://thisisaurl.com/data/asd007?start=2021-06-28&end=2021-07-26
Period from 2021-07-27 to 2021-08-24
URL: https://thisisaurl.com/data/asd007?start=2021-07-27&end=2021-08-24
Period from 2021-08-25 to 2021-09-22
URL: https://thisisaurl.com/data/asd007?start=2021-08-25&end=2021-09-22
Period from 2021-09-23 to 2021-10-21
URL: https://thisisaurl.com/data/asd007?start=2021-09-23&end=2021-10-21
Period from 2021-10-22 to 2021-10-24
URL: https://thisisaurl.com/data/asd007?start=2021-10-22&end=2021-10-24
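
For the third step in the question (collecting the data from each call), something along these lines might work. This is only a sketch under assumptions: the endpoint is taken to return a JSON list of records, the 28-day limit is taken to count both endpoints (so each window ends 27 days after it starts), and build_urls, fetch_json and collect are made-up names. The download function is passed in as a parameter so that the loop can be tried without touching the network.

```python
import json
from datetime import date, timedelta
from urllib.parse import urlencode
from urllib.request import urlopen

BASE_URL = "https://thisisaurl.com/data/asd007"


def build_urls(start, end, max_days=28):
    """Return one URL per window of at most max_days days (inclusive)."""
    urls = []
    current = start
    while current <= end:
        # Clamp the last window so it stops at `end`
        window_end = min(current + timedelta(days=max_days - 1), end)
        query = urlencode({"start": current.isoformat(), "end": window_end.isoformat()})
        urls.append(f"{BASE_URL}?{query}")
        # The next window starts the day after this one ends
        current = window_end + timedelta(days=1)
    return urls


def fetch_json(url):
    """Download one URL and decode its JSON body (assumed to be a list)."""
    with urlopen(url) as response:
        return json.load(response)


def collect(start, end, fetch=fetch_json):
    """Call fetch() once per window and concatenate the returned records."""
    records = []
    for url in build_urls(start, end):
        records.extend(fetch(url))
    return records
```

Calling collect(date(2021, 5, 1), date(2021, 10, 24)) would then make seven requests, one per window. If the API returns a dict rather than a list, the records.extend(...) line would need to be adapted.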