I'm trying to calculate the difference between two times, and I'm struggling with datetime because I only have a time value, not a date. When I write something like t1 = datetime.strptime(item[1], '%H:%M:%S') with, for example, item[1] = "00:01:34" and then print t1, I get something like t1: 1900-01-01 00:01:34.
When I calculate timedelta = t2 - t1 and then try to print the result, nothing is printed because the program goes into a loop. I already tried using time / .time(), but that gives me TypeError: unsupported operand type(s) for -: 'builtin_function_or_method' and 'builtin_function_or_method'. If I hover over timedelta = (t2 - t1).seconds, it reports Class 'time' does not define '__sub__', so the '-' operator cannot be used on its instances, and if I run the code I get the TypeError even if I use .time().
For clarity, my code is this:
from datetime import datetime

listaDivisa = [
['2010-01-05', '12:32:05', 'at the kitchen entrance from the dining room', 'ON'],
['2010-01-05', '12:32:05', 'in the kitchen', 'ON'],
['2010-01-05', '12:32:08', 'in the living room', 'ON'],
['2010-01-05', '12:32:08', 'in the kitchen', 'OFF'],
['2010-01-05', '12:32:10', 'at the kitchen entrance from the dining room', 'OFF'],
['2010-01-05', '12:32:10', 'in the kitchen', 'ON'],
['2010-01-05', '12:32:11', 'in the kitchen', 'OFF'],
['2010-01-05', '12:32:11', 'in the living room', 'OFF'],
['2010-01-06', '02:32:11', 'in the kitchen', 'ON'],
['2010-01-06', '02:32:20', 'in the kitchen', 'OFF'],
['2010-01-06', '02:34:23', 'in the living room', 'ON'],
['2010-01-06', '02:34:42', 'in the living room', 'OFF']]
# the real list contains approximately 3000 of these activities; I've included
# only a few here as an example
listaDict = {}
for p in listaDivisa:
    if p[2] not in listaDict.keys():
        listaDict[p[2]] = dict()
for i, item in enumerate(listaDivisa):
    for j in range(i + 1, len(listaDivisa) - 1):
        if item[0] == listaDivisa[j][0]:
            if item[2] == listaDivisa[j][2]:
                if item[3] == "ON" and listaDivisa[j][3] == "OFF":
                    t1 = datetime.strptime(item[1], '%H:%M:%S')
                    t2 = datetime.strptime(listaDivisa[j][1], '%H:%M:%S')
                    timedelta = (t2 - t1).seconds
                    listaDict[item[2]][item[0]] = "active for " + str(timedelta) + " seconds"
for key, value in listaDict.items():
    print(key, ' : ', value)
The result I'm trying to achieve is a nested dictionary like this:
in the kitchen: {'2009-10-16': 'active for 2341 seconds', '2009-10-17': 'active for 0 seconds' ..... }
in the living room: {'2009-10-16': 'active for 20 seconds', '2009-10-17': 'active for 3 seconds' ..... }
at the kitchen entrance from the dining room: {'2009-10-16': 'active for 6473 seconds', '2009-10-17': 'active for 374 seconds' ... }
CodePudding user response:
Here's a solution to your problem using pandas. I trust other users will find ways to refactor some of the below to improve it, but I believe it does the job. I've included comments to explain what is happening at each stage. Let me know if anything is unclear, or if the result is not what you intended.
import pandas as pd
listaDivisa = [
['2010-01-05', '12:32:05', 'at the kitchen entrance from the dining room', 'ON'],
['2010-01-05', '12:32:05', 'in the kitchen', 'ON'],
['2010-01-05', '12:32:08', 'in the living room', 'ON'],
['2010-01-05', '12:32:08', 'in the kitchen', 'OFF'],
['2010-01-05', '12:32:10', 'at the kitchen entrance from the dining room', 'OFF'],
['2010-01-05', '12:32:10', 'in the kitchen', 'ON'],
['2010-01-05', '12:32:11', 'in the kitchen', 'OFF'],
['2010-01-05', '12:32:11', 'in the living room', 'OFF'],
['2010-01-06', '02:32:11', 'in the kitchen', 'ON'],
['2010-01-06', '02:32:20', 'in the kitchen', 'OFF'],
['2010-01-06', '02:34:23', 'in the living room', 'ON'],
['2010-01-06', '02:34:42', 'in the living room', 'OFF']]
# 1) list of lists to df
df = pd.DataFrame(listaDivisa, columns=['Date','Time','Activity','Status'])
# 2) add datetime col
df['Datetime'] = pd.to_datetime(df.Date + ' ' + df.Time)
# 3) groupby date & activity, calc diff between datetime vals (= timedeltas)
# and convert those timedeltas to seconds
df['Timedelta'] = df.groupby(['Date','Activity'])['Datetime'].transform('diff').dt.total_seconds()
# 4) create new df just on df.Status == 'OFF', these will be the correct timedeltas
new_df = df.loc[df.Status=='OFF',['Date','Activity','Timedelta']]
# =============================================================================
# new_df at this stage:
#
#           Date                                      Activity  Timedelta
# 3   2010-01-05                                in the kitchen        3.0
# 4   2010-01-05  at the kitchen entrance from the dining room        5.0
# 6   2010-01-05                                in the kitchen        1.0
# 7   2010-01-05                            in the living room        3.0
# 9   2010-01-06                                in the kitchen        9.0
# 11  2010-01-06                            in the living room       19.0
# =============================================================================
# 5) now groupby to get sum per activity & date and reset the index
# e.g. on '2010-01-05' we have 2 timedeltas for 'in the kitchen': 3 sec + 1 sec = 4 sec
new_df = new_df.groupby(['Activity','Date'])['Timedelta'].sum().reset_index()
# =============================================================================
# e.g. new_df now looks like:
#                                        Activity        Date  Timedelta
# 0  at the kitchen entrance from the dining room  2010-01-05        5.0
# 1                                in the kitchen  2010-01-05        4.0
# 2                                in the kitchen  2010-01-06        9.0
# 3                            in the living room  2010-01-05        3.0
# 4                            in the living room  2010-01-06       19.0
# =============================================================================
# 6) turn df.Timedelta into strings
new_df['Timedelta'] = new_df['Timedelta'].apply(lambda x: f'active for {int(x)} seconds')
# 7) set df.Activity as index
new_df.set_index('Activity', drop=True, inplace=True)
# 8) define a dictionary and iter over rows to populate it
a_dict = {}
for row, data in new_df.iterrows():
    # if row already exists in dict, we want to *add* a key:value pair
    if row in a_dict:
        a_dict[row].update({data['Date']: data['Timedelta']})
    # else, we want to create a new key, and set its init value to key:value pair
    else:
        a_dict[row] = {data['Date']: data['Timedelta']}
a_dict
{'at the kitchen entrance from the dining room': {'2010-01-05': 'active for 5 seconds'},
'in the kitchen': {'2010-01-05': 'active for 4 seconds',
'2010-01-06': 'active for 9 seconds'},
'in the living room': {'2010-01-05': 'active for 3 seconds',
'2010-01-06': 'active for 19 seconds'}}
So, now you can do something like:
a_dict['in the kitchen']['2010-01-05']
'active for 4 seconds'
# As mentioned: the 2 entries for 'in the kitchen' on '2010-01-05'
# are summed to 4 seconds
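If you would rather avoid pandas, the same idea can be sketched with only the standard library. This is a minimal sketch, not the approach used above: it assumes the rows are in chronological order and that each 'OFF' closes the most recent 'ON' for the same activity and date. The names events, active_seconds, switched_on and totals are made up for this example. Parsing the date and time together gives full datetime objects, so the subtraction returns a timedelta and the 1900-01-01 placeholder date never appears.

```python
from collections import defaultdict
from datetime import datetime

# A shortened copy of the sample data (hypothetical variable name).
events = [
    ['2010-01-05', '12:32:05', 'in the kitchen', 'ON'],
    ['2010-01-05', '12:32:08', 'in the kitchen', 'OFF'],
    ['2010-01-05', '12:32:10', 'in the kitchen', 'ON'],
    ['2010-01-05', '12:32:11', 'in the kitchen', 'OFF'],
    ['2010-01-06', '02:34:23', 'in the living room', 'ON'],
    ['2010-01-06', '02:34:42', 'in the living room', 'OFF'],
]


def active_seconds(rows):
    """Sum the ON -> OFF durations per activity and date."""
    totals = defaultdict(lambda: defaultdict(float))
    switched_on = {}  # (activity, date) -> datetime of the pending ON event
    for date, time, activity, status in rows:
        # Parse date and time together so the result is a full datetime.
        stamp = datetime.strptime(f'{date} {time}', '%Y-%m-%d %H:%M:%S')
        key = (activity, date)
        if status == 'ON':
            switched_on[key] = stamp
        elif status == 'OFF' and key in switched_on:
            # datetime - datetime gives a timedelta; add its length in seconds.
            elapsed = stamp - switched_on.pop(key)
            totals[activity][date] += elapsed.total_seconds()
    # Convert the numeric totals into the requested strings.
    return {
        activity: {d: f'active for {int(s)} seconds' for d, s in per_date.items()}
        for activity, per_date in totals.items()
    }


result = active_seconds(events)
print(result['in the kitchen']['2010-01-05'])  # active for 4 seconds
```

Because this makes a single pass over the rows instead of comparing every pair, it should also stay quick on the full list of roughly 3000 entries. An 'OFF' with no preceding 'ON' on the same date is simply ignored here, which may or may not be what you want.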