De-aggregating a table and randomizing datetime

Time:11-02

I have a table with the following structure; the count column gets updated every time a user accesses the app again on that date.

user_id  date      count
1        1/1/2021  4
2        1/1/2021  7
1        1/2/2021  3
3        1/2/2021  10
2        1/3/2021  4
4        1/1/2021  12

I want to de-aggregate this data based on the count, so that, for example, user_id 1 has four records on 1/1/2021, without the count column. After that, I want to append a random time to each date. My output would look like this:

user_id  date_time
1        1/1/2021 16:00:21
1        1/1/2021 7:23:55
1        1/1/2021 12:01:45
1        1/1/2021 21:21:07

I'm using pandas for this. Randomizing the timestamps is straightforward, I think; it is de-aggregating the data based on a column that is a little tricky for me.

CodePudding user response:

You can repeat each index label according to its count, select those rows, and add a random offset between 0 and 24 hours to the date:

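import numpy as np
import pandas as pd

# Repeat each row `count` times, then add a random number of seconds (0 to 86399) to the date
(df.loc[df.index.repeat(df['count'])]
   .assign(date=lambda d: pd.to_datetime(d['date'])
                          + pd.to_timedelta(np.random.randint(0, 24*3600, size=len(d)), unit='s'))
   .rename(columns={'date': 'date_time'})  # columns= is required; a bare mapping renames index labels
   .drop('count', axis=1)
)

output:

   user_id           date_time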
0        1 2021-01-01 03:32:40
0        1 2021-01-01 03:54:18
0        1 2021-01-01 00:57:49
0        1 2021-01-01 13:04:08
1        2 2021-01-01 00:34:03
1        2 2021-01-01 00:14:17
1        2 2021-01-01 03:57:20
1        2 2021-01-01 22:01:11
1        2 2021-01-01 22:09:55
1        2 2021-01-01 13:15:36
1        2 2021-01-01 12:26:39
2        1 2021-01-02 22:51:17
2        1 2021-01-02 13:44:12
2        1 2021-01-02 01:39:14
3        3 2021-01-02 09:22:16
3        3 2021-01-02 03:34:15
3        3 2021-01-02 23:05:49
3        3 2021-01-02 02:21:35
3        3 2021-01-02 19:51:41
3        3 2021-01-02 16:02:20
3        3 2021-01-02 18:14:05
3        3 2021-01-02 09:07:14
3        3 2021-01-02 22:43:44
3        3 2021-01-02 20:48:15
4        2 2021-01-03 19:25:04
4        2 2021-01-03 14:08:03
4        2 2021-01-03 21:23:58
4        2 2021-01-03 17:24:58
5        4 2021-01-01 23:37:41
5        4 2021-01-01 06:06:17
5        4 2021-01-01 19:23:29
5        4 2021-01-01 02:12:50
5        4 2021-01-01 08:09:59
5        4 2021-01-01 03:49:30
5        4 2021-01-01 08:00:42
5        4 2021-01-01 08:03:34
5        4 2021-01-01 15:36:12
5        4 2021-01-01 14:50:43
5        4 2021-01-01 14:54:04
5        4 2021-01-01 14:58:08
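
For anyone who wants to run this end to end, here is a minimal self-contained sketch of the same idea. The sample frame is copied from the question. The function name deaggregate, the seed parameter, the explicit format="%m/%d/%Y" (month first, matching the output above) and the reset_index call are my own additions, not part of the answer.

```python
import numpy as np
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "user_id": [1, 2, 1, 3, 2, 4],
    "date": ["1/1/2021", "1/1/2021", "1/2/2021", "1/2/2021", "1/3/2021", "1/1/2021"],
    "count": [4, 7, 3, 10, 4, 12],
})

def deaggregate(df, seed=None):
    """Repeat each row `count` times and add a random time of day to `date`.

    `deaggregate` and `seed` are illustrative names, not from the original answer.
    """
    rng = np.random.default_rng(seed)  # pass a seed for repeatable output
    # Repeat every index label `count` times and select those rows;
    # reset_index replaces the duplicated labels with a fresh 0..n-1 index.
    out = df.loc[df.index.repeat(df["count"])].reset_index(drop=True)
    # Random offsets in whole seconds within [0, 86400), i.e. 00:00:00 to 23:59:59.
    offsets = pd.to_timedelta(rng.integers(0, 24 * 3600, size=len(out)), unit="s")
    out["date_time"] = pd.to_datetime(out["date"], format="%m/%d/%Y") + offsets
    return out[["user_id", "date_time"]]

result = deaggregate(df, seed=0)
print(len(result))  # 40 rows: 4 + 7 + 3 + 10 + 4 + 12
print(result.head())
```

If you need the column as text rather than as datetimes, result["date_time"].dt.strftime("%m/%d/%Y %H:%M:%S") should do it, though note that it zero-pads the month, day and hour.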