My data looks like so:
import pandas as pd
import numpy as np
BG_test_df = pd.DataFrame(
{'PERSON_ID': [1, 1, 1],
'TS': ['2021-08-14 19:00:27', '2021-08-14 20:00:27', '2021-08-14 22:35:27'],
'bias': ["Not outside of acceptable operation. Refer to patient education","Not outside of acceptable operation. Refer to patient education","Suboptimal"]}
)
CGM_test_df = pd.DataFrame(
{'PERSON_ID': [1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1],
'SG': [50, 51, 52, 53, 54, 55, 400, 400, 400, 400, 400, 400, 400, 400,50, 51, 52, 53, 54, 55, 400],
'TS': ['2021-08-14 18:30:27','2021-08-14 18:35:27','2021-08-14 18:40:27','2021-08-14 18:45:27','2021-08-14 18:50:27','2021-08-14 18:55:27', '2021-08-14 19:00:27', '2021-08-14 19:30:27','2021-08-14 19:35:27','2021-08-14 19:40:27','2021-08-14 19:45:27','2021-08-14 19:50:27','2021-08-14 19:55:27','2021-08-14 20:00:27', '2021-08-14 20:30:27','2021-08-14 20:35:27','2021-08-14 20:40:27','2021-08-14 20:45:27','2021-08-14 20:50:27','2021-08-14 20:55:27','2021-08-14 21:00:27']
}
)
problematic = BG_test_df.loc[BG_test_df['bias'] == "Suboptimal"]
# Convert to datetime
problematic['BG_TS'] = pd.to_datetime(problematic['TS'])
CGM_test_df['CGM_TS'] = pd.to_datetime(CGM_test_df['TS'])
merged = CGM_test_df.merge(problematic, on = "PERSON_ID")
#resample in 5 min intervals fill in empty rows with na
filled = (merged.set_index('CGM_TS').resample('5T').sum().reset_index())
filled.replace(0, np.nan, inplace=True)
When I perform a resample to set the CGM_TS column to have 5 minute intervals, I lose my other columns. In particular, I need the BG_TS column to continue the rest of my analysis. How can I retain the BG_TS column in the filled dataset?
Thanks in advance
CodePudding user response:
You can specify a different aggregation function (e.g. min
) for BG_TS
to keep it in the result:
merged.set_index('CGM_TS').resample('5T').agg({'PERSON_ID':np.sum, 'SG':np.sum, 'BG_TS':np.min}).reset_index()
Output (for your sample data):
CGM_TS PERSON_ID SG BG_TS
0 2021-08-14 18:30:00 1 50 2021-08-14 22:35:27
1 2021-08-14 18:35:00 1 51 2021-08-14 22:35:27
2 2021-08-14 18:40:00 1 52 2021-08-14 22:35:27
3 2021-08-14 18:45:00 1 53 2021-08-14 22:35:27
4 2021-08-14 18:50:00 1 54 2021-08-14 22:35:27
5 2021-08-14 18:55:00 1 55 2021-08-14 22:35:27
6 2021-08-14 19:00:00 1 400 2021-08-14 22:35:27
7 2021-08-14 19:05:00 0 0 NaT
...
14 2021-08-14 19:40:00 1 400 2021-08-14 22:35:27
15 2021-08-14 19:45:00 1 400 2021-08-14 22:35:27
16 2021-08-14 19:50:00 1 400 2021-08-14 22:35:27
...
22 2021-08-14 20:20:00 0 0 NaT
23 2021-08-14 20:25:00 0 0 NaT
24 2021-08-14 20:30:00 1 50 2021-08-14 22:35:27
25 2021-08-14 20:35:00 1 51 2021-08-14 22:35:27
26 2021-08-14 20:40:00 1 52 2021-08-14 22:35:27
27 2021-08-14 20:45:00 1 53 2021-08-14 22:35:27
28 2021-08-14 20:50:00 1 54 2021-08-14 22:35:27
29 2021-08-14 20:55:00 1 55 2021-08-14 22:35:27
30 2021-08-14 21:00:00 1 400 2021-08-14 22:35:27