Home > Mobile >  Changing datetime to hour or minute difference using pandas
Changing datetime to hour or minute difference using pandas

Time:02-13

I have a dataframe that has a column in datetime type such as:

                date  id
0   02/04/2015 02:34   1
1   06/04/2015 12:34   2
2   09/04/2015 23:03   3
3   12/04/2015 01:00   4
4   15/04/2015 07:12   5
5   21/04/2015 12:59   6
6   29/04/2015 17:33   7
7   04/05/2015 10:44   8
8   06/05/2015 11:12   9
9   10/05/2015 08:52  10
10  12/05/2015 14:19  11
11  19/05/2015 19:22  12
12  27/05/2015 22:31  13
13  01/06/2015 11:09  14
14  04/06/2015 12:57  15
15  10/06/2015 04:00  16
16  15/06/2015 03:23  17
17  19/06/2015 05:37  18
18  23/06/2015 13:41  19
19  27/06/2015 15:43  20

I would like to change the date column reference such that the first value 02/04/2015 02:34 will be 0, and the second one will be the difference between 06/04/2015 12:34 - 02/04/2015 02:34 = 6360 minutes or 106 Hours.

and so on. How can I change the column in pandas to something similar to the above explanation.

Thanks

CodePudding user response:

You could tackle this problem in 2 steps (and a last optional step):

  1. Align the values in "date" column with the target value (the one in the next row)
  2. Calculate the difference between both dates
  3. (Optional) Convert the time delta to a relevant time unit (for example, minutes)

So, how could we do this?

For the first step, we'll shift the values using the pandas.Series.shift method.

df["date2"] = df["date"].shift(1)

This will add a "date2" column with the values of the "date" column shifted by adding an empty value at the top. To fill this space, we'll use the first value again (you requested the first value to be zero and we'll use difference next so...)

df.loc[0, "date2"] = df.loc[0, "date"]

For the second step, we'll just take the difference between our columns. Because both "date" and "date2" are datetime64[ns] columns, we are able to perform this operation:

df["diff"] = df["date"] - df["date2"]

This is how your DataFrame first 5 rows looks now:

                 date  id               date2            diff
0 2015-04-02 02:34:00   1 2015-04-02 02:34:00 0 days 00:00:00
1 2015-04-06 12:34:00   2 2015-04-02 02:34:00 4 days 10:00:00
2 2015-04-09 23:03:00   3 2015-04-06 12:34:00 3 days 10:29:00
3 2015-04-12 01:00:00   4 2015-04-09 23:03:00 2 days 01:57:00
4 2015-04-15 07:12:00   5 2015-04-12 01:00:00 3 days 06:12:00

Please consider that the "diff" column has timedelta64[ns] type. If you ant to transform this unit to minutes...

df["diff"].dt.total_seconds().div(60).astype(int)

Same first 5 rows of your DataFrame with the last optional step:

                 date  id               date2  diff
0 2015-04-02 02:34:00   1 2015-04-02 02:34:00     0
1 2015-04-06 12:34:00   2 2015-04-02 02:34:00  6360
2 2015-04-09 23:03:00   3 2015-04-06 12:34:00  4949
3 2015-04-12 01:00:00   4 2015-04-09 23:03:00  2997
4 2015-04-15 07:12:00   5 2015-04-12 01:00:00  4692

CodePudding user response:

here is my solution:

import pandas as pd
df = pd.DataFrame(date, columns=['date'])
df['sec'] = pd.to_datetime(df['date']).astype('int64') // 10**9
df['diff_in_sec'] = df['sec'] - df['sec'].iloc[0]

first, convert the date column into the DateTime column then convert it to UNIX timestamp and then calculate the difference in seconds. if you want it as minutes or hours just divide diff_in_sec by 60 or 3600 like this:

df['diff_in_hour'] = df['diff_in_sec'].div(3600)

CodePudding user response:

Convert date column to datetime64 and use diff. After that, use dt accessor to floor your date by hour and get total_seconds and divide by 60 or 3600:

df['date'] = pd.to_datetime(df['date'], dayfirst=True)

df['hours'] = df['date'].diff().dt.floor('H').dt.total_seconds().div(3600).fillna(0).astype(int)
# OR
df['minutes'] = df['date'].diff().dt.floor('H').dt.total_seconds().div(60).fillna(0).astype(int)

Output:

>>> df
                  date  id  hours  minutes
0  2015-04-02 02:34:00   1      0        0
1  2015-04-06 12:34:00   2    106     6360
2  2015-04-09 23:03:00   3     82     4920
3  2015-04-12 01:00:00   4     49     2940
4  2015-04-15 07:12:00   5     78     4680
5  2015-04-21 12:59:00   6    149     8940
6  2015-04-29 17:33:00   7    196    11760
7  2015-05-04 10:44:00   8    113     6780
8  2015-05-06 11:12:00   9     48     2880
9  2015-05-10 08:52:00  10     93     5580
10 2015-05-12 14:19:00  11     53     3180
11 2015-05-19 19:22:00  12    173    10380
12 2015-05-27 22:31:00  13    195    11700
13 2015-06-01 11:09:00  14    108     6480
14 2015-06-04 12:57:00  15     73     4380
15 2015-06-10 04:00:00  16    135     8100
16 2015-06-15 03:23:00  17    119     7140
17 2015-06-19 05:37:00  18     98     5880
18 2015-06-23 13:41:00  19    104     6240
19 2015-06-27 15:43:00  20     98     5880
  • Related