How do you deal with datetime obj when applying ANN models?

How do you deal with datetime objects when applying ANN models? I have thought of writing a function that iterates through the column, but there has to be a cleaner way to do so, right?

dataset.info()

 #   Column               Non-Null Count  Dtype         
---  ------               --------------  -----         
 0   Unnamed: 0           299 non-null    int64         
 1   ZIP                  299 non-null    int64         
 2   START_TIME           299 non-null    datetime64[ns]

from sklearn.preprocessing import StandardScaler
sc = StandardScaler()
x = sc.fit_transform(x)

float() argument must be a string or a number, not 'Timestamp'

Another attempt, in connection with a scatter plot, raised: TypeError: float() argument must be a string or a number, not 'datetime.time'

could not convert string to float: '2022-03-16 11:55:00'

CodePudding user response：

# Timestamp.value is the integer number of nanoseconds since the Unix epoch
dataset['START_TIME'] = pd.to_datetime(dataset['START_TIME']).apply(lambda x: x.value)

Seems like a clean way of doing so, but I'm still open to alternatives.
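For comparison, here is a minimal sketch of the same idea without a per-row lambda. It assumes the column can be parsed by pd.to_datetime; the toy DataFrame and the names start, seconds, and scaled are made up for illustration. The manual standardization is meant to mirror what StandardScaler does by default (zero mean, unit variance), not to replace it.

```python
import pandas as pd

# Hypothetical stand-in for the asker's dataset
dataset = pd.DataFrame({
    "ZIP": [10001, 10002, 10003],
    "START_TIME": [
        "2022-03-16 11:55:00",
        "2022-03-16 12:55:00",
        "2022-03-16 13:55:00",
    ],
})

# Parse to datetime64, then express each timestamp as seconds since the Unix epoch
start = pd.to_datetime(dataset["START_TIME"])
seconds = (start - pd.Timestamp("1970-01-01")) / pd.Timedelta(seconds=1)

# Standardize to zero mean and unit variance (population std, ddof=0)
scaled = (seconds - seconds.mean()) / seconds.std(ddof=0)
dataset["START_TIME"] = scaled
```

Once the column is numeric, it can also be passed to StandardScaler together with the other features.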

CodePudding user response：

I would suggest the following steps.

First, convert the string to a datetime.datetime object:

from datetime import datetime
t = datetime.strptime("2022-03-16 11:55:00","%Y-%m-%d %H:%M:%S")

Then extract the necessary components to pass as inputs to the network:

x1,x2,x3 = t.month, t.hour, t.minute
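Since the question asks for something cleaner than iterating over the column, the same extraction can be sketched on a whole pandas column with the .dt accessor. The toy DataFrame and the new column names (month, hour, minute) are assumptions for illustration.

```python
import pandas as pd

# Hypothetical stand-in for the asker's dataset
dataset = pd.DataFrame({
    "START_TIME": pd.to_datetime(["2022-03-16 11:55:00", "2022-12-01 23:05:00"]),
})

# The .dt accessor extracts each component for the whole column at once
dataset["month"] = dataset["START_TIME"].dt.month
dataset["hour"] = dataset["START_TIME"].dt.hour
dataset["minute"] = dataset["START_TIME"].dt.minute
```

The resulting integer columns can then be used as network inputs in place of the raw datetime column.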

As an aside, I noticed you are scaling the time components directly. Depending on the problem, a different kind of pre-processing may work better. For example, you can feed the network the sine and cosine of each time component instead of the raw or scaled value. Because these components are periodic, the sine/cosine pair preserves the distance between time points: 23:00 and 00:00 end up next to each other rather than 23 units apart.

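import numpy as np
# Map the hour onto a circle with period 24 before taking cosine and sine
hour_cos = np.cos(2 * np.pi * t.hour / 24)
hour_sin = np.sin(2 * np.pi * t.hour / 24)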

Extract other periodic components as the problem requires. For a weather variable, the sine and cosine of the hour and the month are typically useful. For sales, the sine and cosine of the day of month and the month are useful.
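A minimal sketch of this cyclical encoding for whole columns follows. The helper add_cyclical, its parameters, and the toy data are hypothetical names introduced for illustration; the period is assumed to be 24 for hours and 12 for months.

```python
import numpy as np
import pandas as pd

def add_cyclical(df, column, period):
    """Add <column>_sin and <column>_cos by mapping values onto a circle of the given period."""
    angle = 2 * np.pi * df[column] / period
    df[column + "_sin"] = np.sin(angle)
    df[column + "_cos"] = np.cos(angle)
    return df

# Toy timestamps: just after midnight, noon, and just before midnight
times = pd.to_datetime([
    "2022-03-16 00:10:00",
    "2022-03-16 12:00:00",
    "2022-03-16 23:50:00",
])
features = pd.DataFrame({"hour": times.hour, "month": times.month})

features = add_cyclical(features, "hour", period=24)
features = add_cyclical(features, "month", period=12)
```

In the encoded space, hour 0 and hour 23 sit close together while hour 0 and hour 12 are on opposite sides of the circle, which is the property a plain or scaled hour value lacks.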

Update: from the comments, I see that you are predicting decibel levels. Assuming you are already factoring in spatial input variables, a sine/cosine transformation is worth trying if the events generating the sounds follow a periodic pattern. Again, this is an assumption and might not hold completely.