I have a dataframe which has two datetime
columns. I am doing difference of those two datetime
columns to find number of days between them. The problem is the time in both of them starts from 06:00:00
so when the actual day gets over, the difference remains the same.
Here is the dataframe
machineID datetime_tel comp1 datetime_maint sincelastComp1
30 2021-01-01 23:00:00 1 2020-07-31 06:00:00 154
30 2021-01-02 00:00:00 1 2020-07-31 06:00:00 154
30 2021-01-02 01:00:00 1 2020-07-31 06:00:00 154
30 2021-01-02 02:00:00 1 2020-07-31 06:00:00 154
30 2021-01-02 03:00:00 1 2020-07-31 06:00:00 154
30 2021-01-02 04:00:00 1 2020-07-31 06:00:00 154
30 2021-01-02 05:00:00 1 2020-07-31 06:00:00 154
30 2021-01-02 06:00:00 1 2020-07-31 06:00:00 155
30 2021-01-02 07:00:00 1 2020-07-31 06:00:00 155
Here you can see when we have date as 2021-01-02
and time as 00:00:00
, there is no increment in sincelastComp1
(ideally it should have been incremented as date as increased by 1). It is only increasing when time is 06:00:00
.
I want the increment to happen when the date changes.
CodePudding user response:
if I understand you right, you want to calculate the difference between dates and not between timestamps:
df['sincelastComp1'] = (df['datetime_tel'].apply(pd.Timestamp).dt.date -
df['datetime_maint'].apply(pd.Timestamp).dt.date).dt.days
output:
machineID datetime_tel comp1 datetime_maint sincelastComp1
0 30 2021-01-01 23:00:00 1 2020-07-31 06:00:00 154
1 30 2021-01-02 00:00:00 1 2020-07-31 06:00:00 155
2 30 2021-01-02 01:00:00 1 2020-07-31 06:00:00 155
3 30 2021-01-02 02:00:00 1 2020-07-31 06:00:00 155
4 30 2021-01-02 03:00:00 1 2020-07-31 06:00:00 155
5 30 2021-01-02 04:00:00 1 2020-07-31 06:00:00 155
6 30 2021-01-02 05:00:00 1 2020-07-31 06:00:00 155
7 30 2021-01-02 06:00:00 1 2020-07-31 06:00:00 155
8 30 2021-01-02 07:00:00 1 2020-07-31 06:00:00 155