Pandas: can't subtract datetime objects? I want to create a column called Trip_Time

from datetime import datetime
import pandas as pd
import numpy as np

df = pd.read_csv('log_may.csv', converters={'ATD': lambda x: str(x),'ATA': lambda x: str(x) })

df['ATD'] = pd.to_datetime(df['ATD'], format='%H%M').dt.time
df['ATA'] = pd.to_datetime(df['ATA'], format='%H%M').dt.time

ATD   ATA   Trip_Time
0900  1010

CodePudding user response：

This is the approach I would use. Given a dataframe of the form:

    ATD     ATA
0   09:00   10:00
1   09:15   09:45
2   09:30   10:15

First convert the string data to datetime objects using:

df['ATD'] = pd.to_datetime(df['ATD'], format='%H:%M')
df['ATA'] = pd.to_datetime(df['ATA'], format='%H:%M')

This produces a dataframe like:

    ATD                 ATA
0   1900-01-01 09:00:00 1900-01-01 10:00:00
1   1900-01-01 09:15:00 1900-01-01 09:45:00
2   1900-01-01 09:30:00 1900-01-01 10:15:00

Then define a function:

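# Calculate the difference in whole minutes between two datetime columns
def mins_diff(end, start):
    # Convert each timestamp to a minute-resolution period, then to its integer ordinal
    # (astype is used because Series.view is deprecated in recent pandas versions)
    end_mins = end.dt.to_period('min').astype('int64')
    start_mins = start.dt.to_period('min').astype('int64')
    return end_mins - start_mins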

And applying the function as below:

df['Trip_Dur'] = mins_diff(df['ATA'], df["ATD"])

Yields:

    ATD                 ATA                 Trip_Dur
0   1900-01-01 09:00:00 1900-01-01 10:00:00 60
1   1900-01-01 09:15:00 1900-01-01 09:45:00 30
2   1900-01-01 09:30:00 1900-01-01 10:15:00 45
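
As an alternative sketch (not the function above), once both columns are real datetimes they can also be subtracted directly, which gives a timedelta column that can then be converted to minutes. The sample frame below is rebuilt by hand to match the example data, and this assumes every trip starts and ends on the same day:

```python
import pandas as pd

# Sample data in the same shape as the example frame above.
df = pd.DataFrame({
    'ATD': ['09:00', '09:15', '09:30'],
    'ATA': ['10:00', '09:45', '10:15'],
})

df['ATD'] = pd.to_datetime(df['ATD'], format='%H:%M')
df['ATA'] = pd.to_datetime(df['ATA'], format='%H:%M')

# Subtracting two datetime64 columns yields a timedelta64 column.
trip = df['ATA'] - df['ATD']

# Convert the timedeltas to whole minutes.
df['Trip_Dur'] = (trip.dt.total_seconds() // 60).astype('int64')

print(df['Trip_Dur'].tolist())  # [60, 30, 45]
```

If you would rather keep the full duration than an integer, you can store trip itself as the column.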

CodePudding user response：

You could use a loop to do the operation with each element:

df['Trip_Time'] = '' # create the empty column (if not created yet)

for indx, row in df.iterrows():
    # iterrows() yields copies of each row, so write the result back to df with .at
    df.at[indx, 'Trip_Time'] = datetime.combine(datetime.min, row['ATA']) - datetime.combine(datetime.min, row['ATD'])

This should give the desired output.
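
If the frame is large, the same idea can be written without iterrows. This is only a sketch: the sample values are made up in the question's HHMM format, and it assumes ATD and ATA fall on the same day. Both times are anchored on the same dummy date because datetime.time objects cannot be subtracted from each other directly:

```python
from datetime import datetime

import pandas as pd

# Hypothetical sample data in the question's HHMM string format.
df = pd.DataFrame({'ATD': ['0900', '1345'], 'ATA': ['1010', '1500']})
df['ATD'] = pd.to_datetime(df['ATD'], format='%H%M').dt.time
df['ATA'] = pd.to_datetime(df['ATA'], format='%H%M').dt.time

# Combine each time with the same dummy date, subtract, and let
# pd.to_timedelta turn the results into a proper timedelta column.
df['Trip_Time'] = pd.to_timedelta([
    datetime.combine(datetime.min, ata) - datetime.combine(datetime.min, atd)
    for atd, ata in zip(df['ATD'], df['ATA'])
])

print(df['Trip_Time'].tolist())
```

Here the first row comes out as 1 hour 10 minutes and the second as 1 hour 15 minutes.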