Subtracting two datetime64 columns and removing rows based on results

Time:10-05

I have a dataframe that looks like this:

   trip_id           start_date  start_station_id             end_date  end_station_id  subscription_type  journey_duration  weekday
0   913460  2019-08-31 23:26:00                50  2019-08-31 23:39:00              70         Subscriber   0 days 00:13:00      Sat
1   913459  2019-08-31 23:11:00                31  2019-08-31 23:28:00              27         Subscriber   0 days 00:17:00      Sat
2   913455  2019-08-31 23:13:00                47  2019-08-31 23:18:00              64         Subscriber   0 days 00:05:00      Sat
3   913454  2019-08-31 23:10:00                10  2019-08-31 23:17:00               8         Subscriber   0 days 00:07:00      Sat
4   913453  2019-08-31 23:09:00                51  2019-08-31 23:22:00              60           Customer   0 days 00:13:00      Sat

Essentially I used

trip_data['journey_duration'] = trip_data['end_date'] - trip_data['start_date']

to get the journey duration. Now I want to remove rows where the journey duration exceeds, say, 36 hours.

I have tried this, without success:

trip_data2 = trip_data[(trip_data['journey_duration'] < 1days 12:00:00) ].copy()

Any suggestions would be greatly appreciated.

Thanks

CodePudding user response:

Try:

import pandas as pd

# convert both columns to datetime (df here is the question's trip_data):
df["start_date"] = pd.to_datetime(df["start_date"])
df["end_date"] = pd.to_datetime(df["end_date"])

# keep only rows where the time difference is less than 36 hours (36 * 60 * 60 seconds):
df_out = df[
    (df["end_date"] - df["start_date"]).dt.total_seconds() < 36 * 60 * 60
]
print(df_out)
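
If the journey_duration column already exists, it may be simpler to compare it directly against a pd.Timedelta instead of converting to seconds. The attempt in the question fails because 1days 12:00:00 is not a valid Python literal; the threshold has to be built as a Timedelta object. Below is a minimal sketch of that idea. The sample rows are made up for illustration (the third trip is a hypothetical 49-hour journey), and it uses <= on the assumption that a trip of exactly 36 hours should be kept.

```python
import pandas as pd

# Hypothetical sample data mirroring the question's columns;
# the third row is an invented 49-hour trip that should be dropped.
trip_data = pd.DataFrame(
    {
        "trip_id": [913460, 913459, 1],
        "start_date": pd.to_datetime(
            ["2019-08-31 23:26:00", "2019-08-31 23:11:00", "2019-08-29 08:00:00"]
        ),
        "end_date": pd.to_datetime(
            ["2019-08-31 23:39:00", "2019-08-31 23:28:00", "2019-08-31 09:00:00"]
        ),
    }
)

# Subtracting two datetime64 columns gives a timedelta64 column.
trip_data["journey_duration"] = trip_data["end_date"] - trip_data["start_date"]

# Build the 36-hour threshold as a Timedelta and compare directly.
max_duration = pd.Timedelta(hours=36)
trip_data2 = trip_data[trip_data["journey_duration"] <= max_duration].copy()
print(trip_data2)
```

The same threshold can also be written as pd.Timedelta("1 days 12:00:00"), which is close to what the question tried. Switch <= to < if a trip of exactly 36 hours should be removed as well.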