I have a dataframe with a datetime index, and I'd like to get the number of minutes remaining until 4:00 PM (or 16:00) for each row's day, using a column calculation.
Using the answer from this post, we can create an empty dataframe with some random datetimes as its index:
import numpy as np
import pandas as pd

def random_datetimes_or_dates(start, end, out_format='datetime', n=10):
    # Work in whole seconds for datetimes, or whole days for dates
    (divide_by, unit) = (10**9, 's') if out_format=='datetime' else (24*60*60*10**9, 'D')
    start_u = start.value//divide_by
    end_u = end.value//divide_by
    return pd.to_datetime(np.random.randint(start_u, end_u, n), unit=unit)

start = pd.to_datetime('2019-01-01')
end = pd.to_datetime('2021-12-31')
index = random_datetimes_or_dates(start, end, out_format='datetime')
df = pd.DataFrame(index=index)
As an example, if the datetime at index n is 2021-11-29 15:30:00, then the value in the new column for that row should read 30. If it's after 16:00, it's OK for the number to be negative.
What I had initially tried was this:
df['Minutes_Until_4PM'] = datetime.strptime("1600", "%H%M").time() - df.index.time()
...but this gives the error:
TypeError: 'numpy.ndarray' object is not callable
...which is fine, but I'm not sure I'm going about this the right way, and the error might just be an artifact of the reproducible code I've provided. Either way, you can see what I'm trying to do: compute this WITHOUT using a for loop.
Answer:
One option is to convert the datetimes to Unix timestamps in seconds, take them modulo the number of seconds in a day (which leaves the seconds elapsed since midnight), subtract that from the number of seconds between midnight and 4 PM, and divide by 60 to get minutes:
secs = (df.index - pd.Timestamp("1970-01-01")) // pd.Timedelta("1s")
df['Minutes_Until_4PM'] = ((16 * 60 * 60) - secs % (24 * 60 * 60)) // 60
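Note that // is floor division, so a partial minute is always rounded down: 15:30:30 gives 29, and 16:00:30 gives -1 rather than 0. That might not be the behaviour you're looking for; use / instead if you want fractional minutes.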
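For comparison, here is a minimal sketch of a different route that stays in pandas timedeltas instead of raw timestamps. It is not the approach from the answer above, just an alternative under the assumption that the index is a timezone-naive DatetimeIndex. The three sample timestamps are hypothetical, picked to show what happens just before and just after 16:00, and the second column name is made up for illustration.

```python
import pandas as pd

# Hypothetical sample index: one exact case, one with a partial minute,
# and one just after 16:00.
index = pd.to_datetime([
    "2021-11-29 15:30:00",
    "2021-11-29 15:30:30",
    "2021-11-29 16:00:30",
])
df = pd.DataFrame(index=index)

# Time elapsed since midnight of each row's own day (a TimedeltaIndex).
since_midnight = df.index - df.index.normalize()

# Signed distance to 16:00 on the same day; negative after 16:00.
until_4pm = pd.Timedelta(hours=16) - since_midnight

# True division keeps the fractional part of a minute.
df["Minutes_Until_4PM"] = until_4pm / pd.Timedelta(minutes=1)

# Floor division rounds down, like // in the modulo approach.
df["Minutes_Until_4PM_floor"] = until_4pm // pd.Timedelta(minutes=1)

print(df)
```

Both columns are computed without a Python-level loop. As for the original error: df.index.time is an attribute that already holds a NumPy array of datetime.time objects, so putting () after it tries to call the array, which raises the TypeError. Dropping the parentheses would not be enough either, since datetime.time objects do not support subtraction.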