Check average travel intervals for each station

I have the following data frame structure:

id_trip	dtm_start_trip	dtm_end_trip	start_station	end_station
1	2018-10-01 10:15:00	2018-10-01 10:17:00	100	200
2	2018-10-01 10:17:00	2018-10-01 10:18:00	200	100
3	2018-10-01 10:19:00	2018-10-01 10:34:00	100	300
4	2018-10-01 10:20:00	2018-10-01 10:22:00	300	100
5	2018-10-01 10:20:00	2018-10-01 10:29:00	400	400

I would like to check, using Python, how often a trip starts and ends at a given station. The idea is to compute these average intervals per day, per hour, and then in windows of a few minutes.

What would be the best approach to doing this?

My desired output would be something like: for station 100 on 2018-10-01, a trip starts, on average, every 4 minutes.

CodePudding user response：

To do that, you could group your DataFrame by travel. First, I would add a new column with a travel id (one id per pair of start and end stations), so that trips starting and ending at the same stations can be grouped together. Then you can group the rows by travel id and get all the information you need.

Please note that your data sample does not include any repeated travel: no two trips share the same start and end stations. Also, consider providing a code sample that builds your data; it would make it easier for us to run tests.
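A minimal sketch of that idea, using the sample rows from the question. The column name `route` and the `"start->end"` label format are assumptions made for illustration; they are not part of the original data.

```python
import pandas as pd

# Sample data copied from the question
df = pd.DataFrame({
    "id_trip": [1, 2, 3, 4, 5],
    "dtm_start_trip": pd.to_datetime([
        "2018-10-01 10:15:00", "2018-10-01 10:17:00", "2018-10-01 10:19:00",
        "2018-10-01 10:20:00", "2018-10-01 10:20:00",
    ]),
    "dtm_end_trip": pd.to_datetime([
        "2018-10-01 10:17:00", "2018-10-01 10:18:00", "2018-10-01 10:34:00",
        "2018-10-01 10:22:00", "2018-10-01 10:29:00",
    ]),
    "start_station": [100, 200, 100, 300, 400],
    "end_station": [200, 100, 300, 100, 400],
})

# Hypothetical travel id: one label per (start station, end station) pair
df["route"] = df["start_station"].astype(str) + "->" + df["end_station"].astype(str)

# Trip count and mean duration per travel id
route_stats = (
    df.assign(duration=df["dtm_end_trip"] - df["dtm_start_trip"])
      .groupby("route")
      .agg(n_trips=("id_trip", "count"), mean_duration=("duration", "mean"))
)
print(route_stats)
```

With the five sample rows, every travel id appears exactly once, so the statistics only become informative on a larger data set.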

CodePudding user response：

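First convert the date column to pandas datetime and sort by it, so that consecutive rows within a station are consecutive trips. Then take the mean difference between consecutive start times for each station:

import pandas as pd

df['dtm_start_trip'] = pd.to_datetime(df['dtm_start_trip'])
df = df.sort_values('dtm_start_trip')

def deltas_mean(x):
    # mean gap between consecutive trip starts
    return x.diff().mean()

# average interval between trip starts, per start station
df.groupby('start_station').agg({'dtm_start_trip': deltas_mean})

Stations with only one trip have no gap to average, so their result is NaT.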
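To get the per-day breakdown asked for in the question, the same idea can be applied within time buckets. The following is only a sketch: the helper name `mean_start_interval` and its `freq` parameter are hypothetical, and it measures gaps only between trips that fall in the same bucket.

```python
import pandas as pd

# Start times and stations copied from the question
df = pd.DataFrame({
    "dtm_start_trip": pd.to_datetime([
        "2018-10-01 10:15:00", "2018-10-01 10:17:00", "2018-10-01 10:19:00",
        "2018-10-01 10:20:00", "2018-10-01 10:20:00",
    ]),
    "start_station": [100, 200, 100, 300, 400],
})

def mean_start_interval(trips, freq):
    """Mean gap between consecutive trip starts per station and time bucket.

    Hypothetical helper: `freq` is a pandas frequency string such as "D".
    """
    trips = trips.sort_values("dtm_start_trip").copy()
    # Bucket each start time, e.g. floor to the start of its day
    trips["bucket"] = trips["dtm_start_trip"].dt.floor(freq)
    keys = ["start_station", "bucket"]
    # Gap to the previous start at the same station within the same bucket
    trips["gap"] = trips.groupby(keys)["dtm_start_trip"].diff()
    return trips.groupby(keys)["gap"].mean()

per_day = mean_start_interval(df, "D")
print(per_day)
```

For station 100 on 2018-10-01 this gives 4 minutes, which matches the desired output in the question. Passing a narrower bucket such as "15min" gives the finer-grained version; buckets containing a single trip yield NaT.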