from datetime import datetime
import pandas as pd
import numpy as np
# Read ATD/ATA as strings so leading zeros (e.g. '0900') are preserved
df = pd.read_csv('log_may.csv', converters={'ATD': str, 'ATA': str})
df['ATD'] = pd.to_datetime(df['ATD'], format='%H%M').dt.time
df['ATA'] = pd.to_datetime(df['ATA'], format='%H%M').dt.time
ATD ATA Trip_Time
0900 1010
CodePudding user response:
This is the approach I would use. Given a dataframe of the form:
ATD ATA
0 09:00 10:00
1 09:15 09:45
2 09:30 10:15
First convert the string data to datetime objects using:
df['ATD'] = pd.to_datetime(df['ATD'], format='%H:%M')
df['ATA'] = pd.to_datetime(df['ATA'], format='%H:%M')
This produces a dataframe like:
ATD ATA
0 1900-01-01 09:00:00 1900-01-01 10:00:00
1 1900-01-01 09:15:00 1900-01-01 09:45:00
2 1900-01-01 09:30:00 1900-01-01 10:15:00
Then define a function:
# Function to calculate the difference in whole minutes between two datetime columns
def mins_diff(end, start):
    # Subtracting datetime columns gives a Timedelta column; convert it to minutes
    return ((end - start).dt.total_seconds() // 60).astype('int64')
Applying the function as below:
df['Trip_Dur'] = mins_diff(df['ATA'], df["ATD"])
Yields:
ATD ATA Trip_Dur
0 1900-01-01 09:00:00 1900-01-01 10:00:00 60
1 1900-01-01 09:15:00 1900-01-01 09:45:00 30
2 1900-01-01 09:30:00 1900-01-01 10:15:00 45
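As a minimal sketch of the same idea applied to the question's `HHMM` strings (such as `0900`), the example below parses with `format='%H%M'` and subtracts the columns directly. The sample rows and the intermediate names `atd` and `ata` are made up for illustration, and the sketch assumes every arrival falls on the same day as its departure.

```python
import pandas as pd

# Hypothetical sample data in the question's HHMM string format
df = pd.DataFrame({"ATD": ["0900", "0915", "0930"],
                   "ATA": ["1010", "0945", "1015"]})

# Parse as datetimes; with no date in the format, pandas fills in 1900-01-01
atd = pd.to_datetime(df["ATD"], format="%H%M")
ata = pd.to_datetime(df["ATA"], format="%H%M")

# Subtracting gives a Timedelta column; convert it to whole minutes
df["Trip_Time"] = ((ata - atd).dt.total_seconds() // 60).astype("int64")
print(df)
```

Keeping the parsed values as full datetimes (rather than calling `.dt.time`) is what makes the subtraction possible, since `datetime.time` objects do not support arithmetic.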
CodePudding user response:
You could use a loop to do the operation with each element:
df['Trip_Time'] = None  # create the empty column (if not created yet)
for indx, row in df.iterrows():
    # iterrows() yields copies, so write the result back with df.at rather than into row
    df.at[indx, 'Trip_Time'] = datetime.combine(datetime.min, row['ATA']) - datetime.combine(datetime.min, row['ATD'])
This should give the desired output.
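For reference, here is a self-contained sketch of the loop approach, assuming the columns already hold `datetime.time` objects as in the question. The sample rows and the `trip_times` list are hypothetical. Collecting the results in a list and assigning them once avoids writing into the dataframe on every iteration.

```python
from datetime import datetime, time

import pandas as pd

# Hypothetical sample data: columns already converted to datetime.time
df = pd.DataFrame({"ATD": [time(9, 0), time(9, 15)],
                   "ATA": [time(10, 10), time(9, 45)]})

trip_times = []
for atd, ata in zip(df["ATD"], df["ATA"]):
    # time objects cannot be subtracted directly, so anchor both to the same date
    trip_times.append(datetime.combine(datetime.min, ata)
                      - datetime.combine(datetime.min, atd))

# Assign the collected timedeltas as a new column
df["Trip_Time"] = trip_times
print(df)
```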