Condition for determining day or night when using datetime module in python

Time:12-14

I want to separate a dataset into two subsets, one for the daytime and one for the nighttime. I used the datetime module of Python for this, and when I apply a boolean mask I get the error: The truth value of a Series is ambiguous.

I have decided on the following criteria:

7:00 A.M. to 7:00 P.M. as daytime and 7:00 P.M. to 7:00 A.M. as nighttime.

I tried to implement this with a boolean mask as follows:

nighttime = traffic[(traffic['date_time'].dt.hour >= 19)  or (traffic['date_time'].dt.hour < 7)]

Can someone guide me towards the correct condition and explain why my condition does not work? My code so far can be found here (https://github.com/Vivek1325/Heavy-Traffic-Indicators-on-I-94/blob/main/traffic analysis.ipynb)

CodePudding user response:

Here is a short explanation of why you have to use the bit-wise or operator | instead of the keyword or. Given two arrays,

import numpy as np

arr0 = np.array([1,2,3])
arr1 = np.array([4,5,6])

if you analyze the conditional operation

m = (arr0 >= 2) | (arr1 <= 5)
#-->   0 1 1    |    1 1 0
print((arr0 >= 2).shape, (arr1 <= 5).shape, m.shape)
print(m)
# (3,) (3,) (3,)
# [ True  True  True]
#-->   1 1 1

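you can observe that it behaves like a bit-wise or, except that it operates on array elements instead of bits.

The keyword or cannot be used with arrays of more than one element, because Python needs a single truth value for each operand. It works with scalar operands, as in True or False, for instance after reducing each array to a scalar with .any():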

m = (arr0 >= 2).any() or (arr1 <= 5).any()
print(m.shape)
print(m)
# ()
# True
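To make the failure concrete, here is a minimal sketch that reuses the two arrays from above and applies the keyword or to the boolean arrays directly. The names raised and message are only for illustration. NumPy refuses to collapse a multi-element array into a single truth value and raises a ValueError, which is the array counterpart of the Series error in the question.

```python
import numpy as np

arr0 = np.array([1, 2, 3])
arr1 = np.array([4, 5, 6])

try:
    # "or" asks Python for bool(arr0 >= 2), which is undefined
    # for an array with more than one element.
    (arr0 >= 2) or (arr1 <= 5)
    raised = False
    message = ""
except ValueError as exc:
    raised = True
    message = str(exc)

print(raised)
print(message)
```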

Coming back to your question, you therefore have to use

traffic[(traffic['date_time'].dt.hour >= 19)  | (traffic['date_time'].dt.hour < 7)]

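since both operands of the condition are boolean Series and | combines them element by element. Keep the parentheses around each comparison, because | binds more tightly than >= and <.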
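As a sketch of the full split, the example below builds a small made-up DataFrame (the rows and the traffic_volume values are assumptions for illustration, not the real dataset) and derives both subsets from one mask. Negating the nighttime mask with ~ gives the daytime rows, so the two subsets cannot overlap or miss a row.

```python
import pandas as pd

# Hypothetical sample data; the real notebook loads a CSV instead.
traffic = pd.DataFrame({
    "date_time": pd.to_datetime([
        "2012-10-02 06:00:00",
        "2012-10-02 07:00:00",
        "2012-10-02 12:00:00",
        "2012-10-02 18:00:00",
        "2012-10-02 19:00:00",
        "2012-10-02 23:00:00",
    ]),
    "traffic_volume": [100, 200, 300, 400, 500, 600],
})

hour = traffic["date_time"].dt.hour

# Night runs from 19:00 up to, but not including, 07:00.
is_night = (hour >= 19) | (hour < 7)

nighttime = traffic[is_night]
daytime = traffic[~is_night]  # everything that is not night

print(nighttime)
print(daytime)
```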

CodePudding user response:

According to the date_time column formatting you linked (e.g. 2012-10-02 09:00:00), and assuming the column is still stored as strings, I would first create a column with just the hour:

df['hour'] = df['date_time'].str.split(' ').str[1].str.split(':').str[0].astype(int)

And then separate the dataframe with your condition. Hour 7 belongs to the daytime, so the comparison has to be strict:

nighttime = df[(df['hour'] >= 19) | (df['hour'] < 7)]
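If the column holds strings, one alternative to splitting them by hand is to parse them with pd.to_datetime and read the hour through the .dt accessor. The sketch below uses a few made-up timestamps in the same format, so the sample rows are an assumption rather than data from the question.

```python
import pandas as pd

# Hypothetical string timestamps in the format shown above.
df = pd.DataFrame({
    "date_time": [
        "2012-10-02 09:00:00",
        "2012-10-02 19:00:00",
        "2012-10-03 07:00:00",
        "2012-10-03 03:00:00",
    ]
})

# Parse the strings once, then extract the hour as an integer column.
df["hour"] = pd.to_datetime(df["date_time"]).dt.hour

is_night = (df["hour"] >= 19) | (df["hour"] < 7)
nighttime = df[is_night]
daytime = df[~is_night]

print(nighttime)
print(daytime)
```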