Grouping and summing time differences from pandas dataframe

Time:11-22

I have a dataframe like the example below:

Timestamp            ComponentName  Utilization
18.10.2020-19:07.10  A              Available
19.10.2020-21:07.10  A              Available
19.10.2020-19:07.10  A              In use
22.10.2020-19:07.10  A              In use
25.10.2020-19:07.10  A              In use

And desired output should be:

ComponentName Total_Inuse_time Total_Available_time
    A          6 days     1 day 2 hours

Basically, I want the total in-use time and the total available time for each component. I have tried grouping by component name and aggregating the time differences with sum, but could not get the desired result.

CodePudding user response:

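import pandas as pd
# Parse the timestamps; the format is assumed to be day.month.year-hour:minute.second
df['Timestamp'] = pd.to_datetime(df['Timestamp'], format='%d.%m.%Y-%H:%M.%S')
# Overwrite each timestamp with the time elapsed since the previous row of the same group
df['Timestamp'] = df.groupby(['ComponentName', 'Utilization'])['Timestamp'].diff().fillna(pd.Timedelta(0))
sums = df.groupby(['ComponentName', 'Utilization'])['Timestamp'].sum()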

Output:

>>> sums
ComponentName  Utilization
A              Available     1 days 02:00:00
               In use        6 days 00:00:00
Name: Timestamp, dtype: timedelta64[ns]

>>> sums['A']
Utilization
Available   1 days 02:00:00
In use      6 days 00:00:00
Name: Timestamp, dtype: timedelta64[ns]

>>> sums['A']['Available']
Timedelta('1 days 02:00:00')
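
If you want the result in the wide layout from the question (one row per component, one column per state), a minimal end-to-end sketch could look like the following. It rebuilds the sample data from the question, assumes the timestamp format is day.month.year-hour:minute.second, and stores the differences in a separate column named Elapsed (a name chosen here for illustration) so the original timestamps are not overwritten.

```python
import pandas as pd

# Sample rows copied from the question
df = pd.DataFrame({
    "Timestamp": [
        "18.10.2020-19:07.10",
        "19.10.2020-21:07.10",
        "19.10.2020-19:07.10",
        "22.10.2020-19:07.10",
        "25.10.2020-19:07.10",
    ],
    "ComponentName": ["A"] * 5,
    "Utilization": ["Available", "Available", "In use", "In use", "In use"],
})

# Assumed format: day.month.year-hour:minute.second
df["Timestamp"] = pd.to_datetime(df["Timestamp"], format="%d.%m.%Y-%H:%M.%S")

# Time elapsed since the previous row of the same (component, state) group.
# "Elapsed" is an illustrative column name; the first row of each group gets 0.
df["Elapsed"] = (
    df.groupby(["ComponentName", "Utilization"])["Timestamp"]
    .diff()
    .fillna(pd.Timedelta(0))
)

# Sum per group, then move the Utilization level into columns
wide = (
    df.groupby(["ComponentName", "Utilization"])["Elapsed"]
    .sum()
    .unstack("Utilization")
    .rename(columns={
        "In use": "Total_Inuse_time",
        "Available": "Total_Available_time",
    })
)
print(wide)
```

For the sample data, component A ends up with 6 days in use and 1 day 2 hours available, matching the desired output. Note that diff() follows the row order within each group, so sort by Timestamp first if your rows are not already in chronological order.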