Home > Back-end >  Apply the average to a timedelta column for two group conditions
Apply the average to a timedelta column for two group conditions

Time:03-16

I have the following dataframe in Python:

ID country_ID visit_time
0 ESP 10 days 12:03:00
0 ENG 5 days 10:02:00
1 ENG 3 days 08:05:03
1 ESP 1 days 03:02:00
1 ENG 2 days 07:01:03
1 ENG 3 days 01:00:52
2 ENG 0 days 12:01:02
2 ENG 1 days 22:10:03
2 ENG 0 days 20:00:50

For each ID, I want to get:

avg_visit_ESP and avg_visit_ENG columns.

  • Average time visit with country_ID = ESP for each ID.
  • Average time visit with country_ID = ENG for each ID.
ID avg_visit_ESP avg_visit_ENG
0 10 days 12:03:00 5 days 10:02:00
1 1 days 03:02:00 (8 days 16:06:58) / 3
2 NaT (3 days 06:11:55) / 3

I don't know how to specify in groupby a double grouping, first by ID and then by country_ID. If you can help me I would appreciate it.

P.S.: The date format of visit_time (timedelta), can perform addition and division without any apparent problem.

from datetime import datetime, timedelta
    date1 = pd.to_datetime('2022-02-04 10:10:21', format='%Y-%m-%d %H:%M:%S')
    date2 = pd.to_datetime('2022-02-05 20:15:41', format='%Y-%m-%d %H:%M:%S')
    date3 = pd.to_datetime('2022-02-07 20:15:41', format='%Y-%m-%d %H:%M:%S')
    sum1date = date2-date1
    sum2date = date3-date2
    sum3date = date3-date1
    print((sum1date sum2date sum3date)/3)

CodePudding user response:

(df.groupby(['ID', 'country_ID'])['visit_time']
   .mean(numeric_only=False)
   .unstack()
   .add_prefix('avg_visit_')
)

should do the trick

>>> df = pd.read_clipboard(sep='\s\s ')
>>> df.columns = [s.strip() for s in df]
>>> df['visit_time'] = pd.to_timedelta(df['visit_time'])
>>> df.groupby(['ID', 'country_ID'])['visit_time'].mean(numeric_only=False).unstack().add_prefix('avg_visit_') 
country_ID            avg_visit_ENG    avg_visit_ESP 
ID                                                   
0                    5 days 10:02:00 10 days 12:03:00
1          2 days 21:22:19.333333333  1 days 03:02:00
2          1 days 02:03:58.333333333              NaT
  • Related