I have a dataframe like the example below:
Timestamp            ComponentName  Utilization
18.10.2020-19:07.10  A              Available
19.10.2020-21:07.10  A              Available
19.10.2020-19:07.10  A              In use
22.10.2020-19:07.10  A              In use
25.10.2020-19:07.10  A              In use
The desired output is:
ComponentName  Total_Inuse_time  Total_Available_time
A              6 days            1 day 2 hours
Basically, I want the total in-use time and the total available time for each component. I tried grouping by component name and summing the time differences, but could not get the desired result.
CodePudding user response:
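import pandas as pd
# Parse the timestamps; the explicit format assumes day.month.year-hour:minute.second
df['Timestamp'] = pd.to_datetime(df['Timestamp'], format='%d.%m.%Y-%H:%M.%S')
# Within each (component, state) group: time since the previous row, then the total
df['Timestamp'] = df.groupby(['ComponentName', 'Utilization'])['Timestamp'].diff().fillna(pd.Timedelta(0))
sums = df.groupby(['ComponentName', 'Utilization'])['Timestamp'].sum()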
Output:
>>> sums
ComponentName  Utilization
A              Available     1 days 02:00:00
               In use        6 days 00:00:00
Name: Timestamp, dtype: timedelta64[ns]
>>> sums['A']
Utilization
Available   1 days 02:00:00
In use      6 days 00:00:00
Name: Timestamp, dtype: timedelta64[ns]
>>> sums['A']['Available']
Timedelta('1 days 02:00:00')
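To get the one-row-per-component layout asked for in the question, the grouped sums can be pivoted with unstack. The following is a minimal, self-contained sketch built on the sample rows from the question. It makes a few assumptions: the timestamp layout is day.month.year-hour:minute.second, the rows should be sorted within each group before taking differences, and the helper column name Elapsed is made up for illustration.

```python
import pandas as pd

# Sample rows copied from the question.
df = pd.DataFrame({
    "Timestamp": [
        "18.10.2020-19:07.10",
        "19.10.2020-21:07.10",
        "19.10.2020-19:07.10",
        "22.10.2020-19:07.10",
        "25.10.2020-19:07.10",
    ],
    "ComponentName": ["A"] * 5,
    "Utilization": ["Available", "Available", "In use", "In use", "In use"],
})

# Assumed layout: day.month.year-hour:minute.second.
df["Timestamp"] = pd.to_datetime(df["Timestamp"], format="%d.%m.%Y-%H:%M.%S")

# Sort so that diff() compares each row with the chronologically
# previous row of the same component and state.
df = df.sort_values(["ComponentName", "Utilization", "Timestamp"])

# Time elapsed since the previous row of the same group
# ("Elapsed" is a hypothetical column name); the first row of a group gets 0.
df["Elapsed"] = (
    df.groupby(["ComponentName", "Utilization"])["Timestamp"]
    .diff()
    .fillna(pd.Timedelta(0))
)

# Sum per group, then move the states into columns:
# one row per component, one column per state.
wide = (
    df.groupby(["ComponentName", "Utilization"])["Elapsed"]
    .sum()
    .unstack("Utilization")
    .rename(columns={"In use": "Total_Inuse_time",
                     "Available": "Total_Available_time"})
)
print(wide)
```

Keeping the differences in a separate column leaves the parsed timestamps intact for later use. Note that, as in the answer above, this sums the gaps between consecutive rows of the same state, which for sorted data equals the last timestamp minus the first one in each group.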