generate a range of datetime dates

Time:11-06

I'm trying to generate a list of start dates that I'll use to scrape Google Trends. I need the start dates to be 3 hours apart, and then I'll generate an end date 4 hours after each start date, so that each end date overlaps the next start date by 1 hour.

from datetime import datetime, timedelta, date
import pandas as pd
import time

start='2018-06-05T01'
end='2020-11-01T23'

start_date = datetime.strptime(start, '%Y-%m-%dT%H')
end_date = datetime.strptime(end, '%Y-%m-%dT%H')

delta = timedelta(hours=3)

while True:
    date_list = []
    date_list.append(start_date + delta)
    if start_date >= end:
        break

This does not work, and I'm not sure how to fix it because I don't know how to keep looping until the end date is reached.

CodePudding user response:

Since you're using pandas anyway, try with date_range:

start_date = pd.to_datetime(start, format='%Y-%m-%dT%H')
end_date = pd.to_datetime(end, format='%Y-%m-%dT%H')
date_list = pd.date_range(start_date, end_date, freq="3H")

>>> date_list
DatetimeIndex(['2018-06-05 01:00:00', '2018-06-05 04:00:00',
               '2018-06-05 07:00:00', '2018-06-05 10:00:00',
               '2018-06-05 13:00:00', '2018-06-05 16:00:00',
               '2018-06-05 19:00:00', '2018-06-05 22:00:00',
               '2018-06-06 01:00:00', '2018-06-06 04:00:00',
               ...
               '2020-10-31 19:00:00', '2020-10-31 22:00:00',
               '2020-11-01 01:00:00', '2020-11-01 04:00:00',
               '2020-11-01 07:00:00', '2020-11-01 10:00:00',
               '2020-11-01 13:00:00', '2020-11-01 16:00:00',
               '2020-11-01 19:00:00', '2020-11-01 22:00:00'],
              dtype='datetime64[ns]', length=7048, freq='3H')

If you don't want this to be a DatetimeIndex, you can use:

date_list = pd.date_range(start_date, end_date, freq="3H").tolist()
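
To also get the overlapping end dates the question asks for, one option is to shift the whole index by four hours. This is only a minimal sketch: the names starts, ends and windows are illustrative, and a pd.Timedelta is passed as freq instead of the "3H" alias, which recent pandas releases deprecate in favor of "3h".

```python
import pandas as pd

start = '2018-06-05T01'
end = '2020-11-01T23'

start_date = pd.to_datetime(start, format='%Y-%m-%dT%H')
end_date = pd.to_datetime(end, format='%Y-%m-%dT%H')

# Start dates every 3 hours, none later than end_date.
starts = pd.date_range(start_date, end_date, freq=pd.Timedelta(hours=3))

# Each end date is 4 hours after its start, so it overlaps the next start by 1 hour.
ends = starts + pd.Timedelta(hours=4)

# Pair them up as (start, end) tuples (illustrative name).
windows = list(zip(starts, ends))
print(windows[0])
```

Note that the last few end dates fall after end_date, so they may need to be clipped depending on what the scraper expects.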

CodePudding user response:

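Your code resets date_list to an empty list on every iteration, and start_date is never updated inside the loop, so the loop cannot make progress. Also, the end variable is a string, not a datetime like end_date, so the comparison start_date >= end fails.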

CodePudding user response:

As user5401398 pointed out, you should

  1. Move the date_list outside the loop
  2. Update start_date in the loop
  3. Compare with the end_date instead of the end variable, which is a string.

A modified version is below.

from datetime import datetime, timedelta, date

start='2018-06-05T01'
end='2020-11-01T23'

start_date = datetime.strptime(start, '%Y-%m-%dT%H')
end_date = datetime.strptime(end, '%Y-%m-%dT%H')

delta = timedelta(hours=3)

date_list = [start_date]

while True:
    start_date += delta
    date_list.append(start_date)
    if start_date >= end_date:
        break
print(date_list)
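
One caveat with the loop above: it appends first and checks afterwards, so when end_date does not fall exactly on the 3-hour grid (as here, with 23:00), the last item lands after end_date. A possible variant, sketched below, checks before yielding and also produces the 4-hour end dates from the question. The helper name date_windows and its parameters are hypothetical, not part of any library.

```python
from datetime import datetime, timedelta


def date_windows(start_date, end_date, step, length):
    """Yield (start, end) pairs; starts are `step` apart and never pass end_date."""
    current = start_date
    while current <= end_date:
        # The window end is `length` after its start.
        yield current, current + length
        current += step


start_date = datetime.strptime('2018-06-05T01', '%Y-%m-%dT%H')
end_date = datetime.strptime('2020-11-01T23', '%Y-%m-%dT%H')

windows = list(date_windows(start_date, end_date,
                            step=timedelta(hours=3),
                            length=timedelta(hours=4)))
print(windows[0])
print(len(windows))
```

With a 3-hour step and a 4-hour length, each window's end is 1 hour past the next window's start, which matches the overlap described in the question.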