I'm trying to generate a list of start dates that I'll use to scrape Google Trends. I need the start dates 3 hours apart, and then I'll generate end dates 4 hours after each start date, so each end date overlaps the next start date by 1 hour.
from datetime import datetime, timedelta, date
import pandas as pd
import time
start='2018-06-05T01'
end='2020-11-01T23'
start_date = datetime.strptime(start, '%Y-%m-%dT%H')
end_date = datetime.strptime(end, '%Y-%m-%dT%H')
delta = timedelta(hours=3)
while True:
    date_list = []
    date_list.append(start_date + delta)
    if start_date >= end:
        break
This does not work, and I'm not sure how to keep looping until the end date is reached.
CodePudding user response:
Since you're using pandas anyway, try date_range:
start_date = pd.to_datetime(start, format='%Y-%m-%dT%H')
end_date = pd.to_datetime(end, format='%Y-%m-%dT%H')
date_list = pd.date_range(start_date, end_date, freq="3H")
>>> date_list
DatetimeIndex(['2018-06-05 01:00:00', '2018-06-05 04:00:00',
'2018-06-05 07:00:00', '2018-06-05 10:00:00',
'2018-06-05 13:00:00', '2018-06-05 16:00:00',
'2018-06-05 19:00:00', '2018-06-05 22:00:00',
'2018-06-06 01:00:00', '2018-06-06 04:00:00',
...
'2020-10-31 19:00:00', '2020-10-31 22:00:00',
'2020-11-01 01:00:00', '2020-11-01 04:00:00',
'2020-11-01 07:00:00', '2020-11-01 10:00:00',
'2020-11-01 13:00:00', '2020-11-01 16:00:00',
'2020-11-01 19:00:00', '2020-11-01 22:00:00'],
dtype='datetime64[ns]', length=7048, freq='3H')
If you don't want this to be a DatetimeIndex, you can use:
date_list = pd.date_range(start_date, end_date, freq="3H").tolist()
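The question also asks for end dates 4 hours after each start date. A minimal sketch of one way to get them from the same range, assuming the names start_dates, end_dates and windows (they are illustrative, not from the original post) and passing the step as a pd.Timedelta instead of a frequency string:

```python
import pandas as pd

start = '2018-06-05T01'
end = '2020-11-01T23'
start_date = pd.to_datetime(start, format='%Y-%m-%dT%H')
end_date = pd.to_datetime(end, format='%Y-%m-%dT%H')

# Start dates every 3 hours; the step is given as a Timedelta rather than a string alias.
start_dates = pd.date_range(start_date, end_date, freq=pd.Timedelta(hours=3))

# Each window ends 4 hours after it starts, so it overlaps the next start by 1 hour.
end_dates = start_dates + pd.Timedelta(hours=4)

# Pair them up as (start, end) tuples of Timestamps.
windows = list(zip(start_dates, end_dates))
print(windows[0])
```

Each element of windows is a (start, end) pair, which can be formatted with strftime into whatever string the scraper expects.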
CodePudding user response:
Your code assigns an empty list to date_list on every iteration, and start_date is never changed inside the loop. The end variable is a string, not a datetime like end_date.
CodePudding user response:
As user5401398 pointed out, you should:
- Move date_list outside the loop
- Update start_date in the loop
- Compare with end_date instead of the end variable, which is a string.
A modified version is below.
from datetime import datetime, timedelta, date
start='2018-06-05T01'
end='2020-11-01T23'
start_date = datetime.strptime(start, '%Y-%m-%dT%H')
end_date = datetime.strptime(end, '%Y-%m-%dT%H')
delta = timedelta(hours=3)
date_list = [start_date]
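while True:
    start_date += delta  # advance to the next 3-hour start
    date_list.append(start_date)
    # Stop once end_date is reached or passed; the last value appended may exceed end_date.
    if start_date >= end_date:
        break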
print(date_list)
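If the list should never step past end_date, and should also carry the 4-hour end dates the question mentions, the loop condition can do the bounds check itself. A minimal sketch using only the standard library; the helper name make_windows and its step and span parameters are hypothetical, chosen here for illustration:

```python
from datetime import datetime, timedelta

def make_windows(start, end, step=timedelta(hours=3), span=timedelta(hours=4)):
    """Return (window_start, window_end) pairs; no window_start is later than `end`."""
    windows = []
    current = start
    while current <= end:  # check before appending, so we never overshoot
        windows.append((current, current + span))
        current += step
    return windows

start_date = datetime.strptime('2018-06-05T01', '%Y-%m-%dT%H')
end_date = datetime.strptime('2020-11-01T23', '%Y-%m-%dT%H')

windows = make_windows(start_date, end_date)
print(len(windows))
print(windows[0])
print(windows[-1])
```

Checking the condition before appending is what keeps the last start date at or before end_date, which matches what date_range returns for the same inputs.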