Let's say I have a URL that looks like this and returns data in JSON format when given a start and end date in the format "YYYY-MM-DD":
url = "https://thisisaurl.com/data/asd007?start=2021-10-01&end=2021-10-28"
The maximum number of days for which data can be retrieved via this URL is 28. To retrieve a year of data, this means that I need to:
- loop over a date range that each time generates a start_date and an end_date with (a max of) 28 days in between
- feed this to the url
- append / collect this data each time the url is called
Let's say I want to retrieve data for the period 2021-05-01 - 2021-10-24. I know how I can loop over this date range with a time delta of 28 days, like so:
from datetime import date, timedelta
start_date = date(2021, 5, 1)
end_date = date(2021, 10, 24)
delta = timedelta(days=28)
while start_date <= end_date:
    print(start_date.strftime("%Y-%m-%d"))
    start_date += delta  # advance to the start of the next 28-day window
2021-05-01
2021-05-29
2021-06-26
2021-07-24
2021-08-21
2021-09-18
2021-10-16
But I struggle with how to assign these values as start and end dates in the URL, and with how to make sure that the full period is covered (e.g. the last window should end on 2021-10-24 rather than the loop stopping at 2021-10-16). Any ideas?
CodePudding user response:
IIUC, you are trying to add the start and end dates to the URL in a loop, which you can do with f-strings, for example:
from datetime import date, timedelta
start_date = date(2021, 5, 1)
end_date = date(2021, 10, 24)
delta = timedelta(days=28)
base_url = "https://thisisaurl.com/data/asd007?"
while start_date <= end_date:
    period_start = start_date.strftime('%Y-%m-%d')
    period_end = (start_date + timedelta(days=27)).strftime('%Y-%m-%d')  # Note: 27 days
    print(f'{base_url}start={period_start}&end={period_end}')
    start_date += delta  # the next window starts 28 days later
Output
https://thisisaurl.com/data/asd007?start=2021-05-01&end=2021-05-28
https://thisisaurl.com/data/asd007?start=2021-05-29&end=2021-06-25
https://thisisaurl.com/data/asd007?start=2021-06-26&end=2021-07-23
https://thisisaurl.com/data/asd007?start=2021-07-24&end=2021-08-20
https://thisisaurl.com/data/asd007?start=2021-08-21&end=2021-09-17
https://thisisaurl.com/data/asd007?start=2021-09-18&end=2021-10-15
https://thisisaurl.com/data/asd007?start=2021-10-16&end=2021-11-12
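Note that the last URL above ends on 2021-11-12, which is past the requested end date of 2021-10-24. A minimal sketch of clamping each window with min() is shown here; it assumes the 28-day limit counts both the start and the end date, which the question does not spell out, and the helper name date_windows is hypothetical.

```python
from datetime import date, timedelta


def date_windows(start, end, max_days=28):
    """Yield (window_start, window_end) pairs covering start..end inclusive."""
    current = start
    while current <= end:
        # A window of max_days days, counting both endpoints, never past `end`.
        window_end = min(current + timedelta(days=max_days - 1), end)
        yield current, window_end
        current = window_end + timedelta(days=1)


windows = list(date_windows(date(2021, 5, 1), date(2021, 10, 24)))
for window_start, window_end in windows:
    print(f"start={window_start.isoformat()}&end={window_end.isoformat()}")
```

With the dates from the question, this should yield the same first six windows as above, plus a shortened final one that stops at 2021-10-24.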
CodePudding user response:
I'm not sure I understood the question, but does this answer it?
from datetime import date, timedelta
start_date = date(2021, 5, 1)
end_date = date(2021, 10, 24)
delta = timedelta(days=28)
while start_date < end_date:
    end_period = start_date + delta
    # Cap the last period at the overall end date
    if end_period > end_date:
        end_period = end_date
    start_str = start_date.strftime('%Y-%m-%d')
    end_str = end_period.strftime('%Y-%m-%d')
    print('Period from', start_str, 'to', end_str)
    print(f'URL: https://thisisaurl.com/data/asd007?start={start_str}&end={end_str}')
    # The next period starts the day after this one ends
    start_date = end_period + timedelta(days=1)
Period from 2021-05-01 to 2021-05-29
URL: https://thisisaurl.com/data/asd007?start=2021-05-01&end=2021-05-29
Period from 2021-05-30 to 2021-06-27
URL: https://thisisaurl.com/data/asd007?start=2021-05-30&end=2021-06-27
Period from 2021-06-28 to 2021-07-26
URL: https://thisisaurl.com/data/asd007?start=2021-06-28&end=2021-07-26
Period from 2021-07-27 to 2021-08-24
URL: https://thisisaurl.com/data/asd007?start=2021-07-27&end=2021-08-24
Period from 2021-08-25 to 2021-09-22
URL: https://thisisaurl.com/data/asd007?start=2021-08-25&end=2021-09-22
Period from 2021-09-23 to 2021-10-21
URL: https://thisisaurl.com/data/asd007?start=2021-09-23&end=2021-10-21
Period from 2021-10-22 to 2021-10-24
URL: https://thisisaurl.com/data/asd007?start=2021-10-22&end=2021-10-24
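Neither answer covers the "append / collect" step from the question, so here is a rough sketch of it. It assumes that each response is a JSON array and that the 28-day limit counts both endpoints; neither is confirmed in the question. The names collect_data, fetch_json and the fetch parameter are hypothetical, and the URL is the placeholder from the question.

```python
import json
from datetime import date, timedelta
from urllib.parse import urlencode
from urllib.request import urlopen

BASE_URL = "https://thisisaurl.com/data/asd007"  # placeholder URL from the question


def fetch_json(url):
    """Download one URL and decode its JSON body."""
    with urlopen(url) as response:
        return json.load(response)


def collect_data(start, end, fetch=fetch_json, max_days=28):
    """Request the data window by window and return all records in one list."""
    records = []
    current = start
    while current <= end:
        window_end = min(current + timedelta(days=max_days - 1), end)
        query = urlencode({"start": current.isoformat(), "end": window_end.isoformat()})
        # Assumption: each response is a JSON array, so extend() flattens the results.
        records.extend(fetch(f"{BASE_URL}?{query}"))
        current = window_end + timedelta(days=1)
    return records
```

Calling collect_data(date(2021, 5, 1), date(2021, 10, 24)) would issue one request per window. Passing a stub as fetch lets you check the generated URLs without touching the network.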