How to calculate time elapsed since an event occurred in a specific column - Pandas DataFrames

Time:12-18

I am a diabetic analyzing readings from my continuous glucose monitor (CGM). I have a spreadsheet where I've logged the date/time, the type of entry, and my blood glucose level. Each row represents a new entry, and entries can be of various types, for example:

  • DOSE_INSULIN (amount of units of insulin injected),
  • NEW_SENSOR (recording that I swapped out CGM sensors), or
  • TEXT (any text-based information I wanted to note for myself).

I want to create a new column that tracks the time (in hours) elapsed since the most recent sensor swap, so that I can eventually determine whether there is a relationship between sensor age and blood glucose level (BGL) control/variability.

Here is an example table beforehand:

date              type
11/21/21 12:55AM  TEXT
11/21/21 1:16AM   DOSE_INSULIN
11/21/21 2:05AM   NEW_SENSOR
11/21/21 2:12AM   DOSE_INSULIN
11/21/21 2:34AM   DOSE_INSULIN
11/21/21 2:44AM   NEW_SENSOR

And here is what I would like it to look like afterwards:

date              type          hours_since_new_sensor
11/21/21 12:55AM  TEXT          NaN
11/21/21 1:16AM   DOSE_INSULIN  NaN
11/21/21 2:05AM   NEW_SENSOR    0
11/21/21 2:12AM   DOSE_INSULIN  0.12
11/21/21 2:34AM   DOSE_INSULIN  0.48
11/21/21 2:44AM   NEW_SENSOR    0

I've found a few other Stack Overflow pages with similar questions, but I am having trouble adapting them to my specific issue.

  • This one creates the new columns based on two separate groups.
  • This one uses grouper, but also groups by an ID, which is confusing me.

So far, I have only been able to get to this line:

df['date'].where(df['type'] == 'NEW_SENSOR')

As I understand it, this outputs a Series(?) of null values, except where the type is NEW_SENSOR, in which case it outputs the date of that event. I can't figure out how to expand on this to get what I really want, though.

Any help or advice would be greatly appreciated, thank you so much!

CodePudding user response:

Try with groupby:

import pandas as pd

df["date"] = pd.to_datetime(df["date"])
# running count of NEW_SENSOR rows: each sensor swap starts a new group
sensor_id = df["type"].eq("NEW_SENSOR").cumsum()
df["hours_since_new_sensor"] = df["date"] - df.groupby(sensor_id)["date"].transform("min")
# reset the values before the first NEW_SENSOR to null
df["hours_since_new_sensor"] = df["hours_since_new_sensor"].where(sensor_id > 0)

>>> df
                 date          type hours_since_new_sensor
0 2021-11-21 00:55:00          TEXT                    NaT
1 2021-11-21 01:16:00  DOSE_INSULIN                    NaT
2 2021-11-21 02:05:00    NEW_SENSOR        0 days 00:00:00
3 2021-11-21 02:12:00  DOSE_INSULIN        0 days 00:07:00
4 2021-11-21 02:34:00  DOSE_INSULIN        0 days 00:29:00
5 2021-11-21 02:44:00    NEW_SENSOR        0 days 00:00:00
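
To see why the cumulative sum works as a group key, here is a small self-contained sketch. It assumes the sample rows from the question, already in chronological order; the explicit `format` string and the names `sensor_id` and `group_start` are only illustrative choices.

```python
import pandas as pd

# Sample data copied from the question (assumed to be sorted by date).
df = pd.DataFrame({
    "date": pd.to_datetime(
        ["11/21/21 12:55AM", "11/21/21 1:16AM", "11/21/21 2:05AM",
         "11/21/21 2:12AM", "11/21/21 2:34AM", "11/21/21 2:44AM"],
        format="%m/%d/%y %I:%M%p",
    ),
    "type": ["TEXT", "DOSE_INSULIN", "NEW_SENSOR",
             "DOSE_INSULIN", "DOSE_INSULIN", "NEW_SENSOR"],
})

# True on NEW_SENSOR rows; the running sum goes up by one at each swap,
# so every row gets the number of swaps seen so far as its group label.
sensor_id = df["type"].eq("NEW_SENSOR").cumsum()
print(sensor_id.tolist())  # [0, 0, 1, 1, 1, 2]

# Earliest date within each group, broadcast back to every row of the group.
# For groups 1 and up, that is the date of the NEW_SENSOR row itself.
group_start = df.groupby(sensor_id)["date"].transform("min")
```

Group 0 holds the rows before the first swap, which is why those rows have to be blanked out separately.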

If you would like to change the time to hours, you can do:

df["hours_since_new_sensor"] = df["hours_since_new_sensor"].dt.total_seconds().div(3600)

>>> df
                 date          type  hours_since_new_sensor
0 2021-11-21 00:55:00          TEXT                     NaN
1 2021-11-21 01:16:00  DOSE_INSULIN                     NaN
2 2021-11-21 02:05:00    NEW_SENSOR                0.000000
3 2021-11-21 02:12:00  DOSE_INSULIN                0.116667
4 2021-11-21 02:34:00  DOSE_INSULIN                0.483333
5 2021-11-21 02:44:00    NEW_SENSOR                0.000000
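
As an alternative, the `where` line from the question can probably be extended directly by forward-filling the swap dates. This is only a sketch, assuming the sample rows from the question in chronological order; `last_swap` is a name made up for illustration.

```python
import pandas as pd

# Sample data copied from the question (assumed to be sorted by date).
df = pd.DataFrame({
    "date": pd.to_datetime(
        ["11/21/21 12:55AM", "11/21/21 1:16AM", "11/21/21 2:05AM",
         "11/21/21 2:12AM", "11/21/21 2:34AM", "11/21/21 2:44AM"],
        format="%m/%d/%y %I:%M%p",
    ),
    "type": ["TEXT", "DOSE_INSULIN", "NEW_SENSOR",
             "DOSE_INSULIN", "DOSE_INSULIN", "NEW_SENSOR"],
})

# Keep the date only on NEW_SENSOR rows, then carry it forward to later rows.
# Rows before the first swap have nothing to carry forward and stay NaT.
last_swap = df["date"].where(df["type"].eq("NEW_SENSOR")).ffill()

# Subtracting NaT gives NaT, which becomes NaN once converted to hours.
df["hours_since_new_sensor"] = (df["date"] - last_swap).dt.total_seconds() / 3600
```

With this version the rows before the first NEW_SENSOR come out as NaN without a separate reset step.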

CodePudding user response:

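import pandas as pd

df["date"] = pd.to_datetime(df["date"])

# running count of NEW_SENSOR rows: each sensor swap starts a new group
g = df['type'].eq('NEW_SENSOR').cumsum()
# per-group gaps between consecutive rows, converted to hours and accumulated
df['hours_since_new_sensor'] = df.groupby(g)['date'].diff().fillna(pd.Timedelta(0)).dt.total_seconds().div(60*60).groupby(g).cumsum().round(2)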

Output:

>>> df
                 date          type  hours_since_new_sensor
0 2021-11-21 00:55:00          TEXT                    0.00
1 2021-11-21 01:16:00  DOSE_INSULIN                    0.35
2 2021-11-21 02:05:00    NEW_SENSOR                    0.00
3 2021-11-21 02:12:00  DOSE_INSULIN                    0.12
4 2021-11-21 02:34:00  DOSE_INSULIN                    0.48
5 2021-11-21 02:44:00    NEW_SENSOR                    0.00
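
Note that in this output rows 0 and 1 show 0.00 and 0.35 rather than the NaN asked for in the question, because the rows before the first NEW_SENSOR are treated as a group of their own. One possible way to blank them out is to mask on the group key, sketched here with the question's sample data (the `hours` variable is just an illustrative name):

```python
import pandas as pd

# Sample data copied from the question (assumed to be sorted by date).
df = pd.DataFrame({
    "date": pd.to_datetime(
        ["11/21/21 12:55AM", "11/21/21 1:16AM", "11/21/21 2:05AM",
         "11/21/21 2:12AM", "11/21/21 2:34AM", "11/21/21 2:44AM"],
        format="%m/%d/%y %I:%M%p",
    ),
    "type": ["TEXT", "DOSE_INSULIN", "NEW_SENSOR",
             "DOSE_INSULIN", "DOSE_INSULIN", "NEW_SENSOR"],
})

g = df["type"].eq("NEW_SENSOR").cumsum()

# Same diff/cumsum chain as above, split over several lines for readability.
hours = (
    df.groupby(g)["date"].diff()
      .fillna(pd.Timedelta(0))
      .dt.total_seconds().div(60 * 60)
      .groupby(g).cumsum()
      .round(2)
)

# g == 0 marks rows before the first NEW_SENSOR; set those to NaN.
df["hours_since_new_sensor"] = hours.where(g > 0)
```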