Home > Blockchain >  Given a list of start and stop times, how can I identify blocks of time that are accounted for acros
Given a list of start and stop times, how can I identify blocks of time that are accounted for acros

Time:01-05

I have 3 sensors logging data on objects at time intervals. I have it stored in a pandas dataframe that looks like this:

DataFrame

The data in the Sensor time columns represents [start time, stop time] and they are formatted as lists. The index is object.

I want to add a column or create a new dataframe that captures all windows in time where an object was tracked by at least one sensor. For instance, the first object is tracked from [0,12] and [14,20]. I know how to do it with a series of loops, but I am wondering if there is a more efficient way to do this?

CodePudding user response:

Let's start by creating the dataframe, df:

import pandas as pd 

lst = [[1,[0,5],[5,12],[14,20]], 
       [2,[5,30],[5,30],[5,35]],
       [3,[30,40],[45,55],[1,20]]]
    
df = pd.DataFrame(lst, columns =['Object', 'Sensor1Time', 'Sensor2Time', 'Sensor3Time'])
print(df)

enter image description here

Next, we define a function called overlap, which calculates the overlap between two given lists:

def overlap (list1,list2):
    if min([list1[1], list2[1]]) >= max([list1[0], list2[0]]):
        result = [min([list1[0], list2[0]]), max([list1[1], list2[1]])]
    else: 
        result = list1 ,list2  
    return result

let's call this function of two lists to see how it works:

overlap([0, 4],[2, 8])

enter image description here

In general, two intervals are overlapping if:

min([upperBoundOfA, upperBoundOfB]) >= max([lowerBoundOfA, lowerBoundOfB])

If this is the case, the union of those intervals is:

(min([lowerBoundOfA, lowerBoundOfB]), max([upperBoundOfA, upperBoundOfB])

taken from here enter image description here

CodePudding user response:

A possible solution:

def get_list(l1, l2, l3):
    l = sorted([l1, l2, l3], key=lambda x: x[0])
    
    extent = []
    cur_extent = l[0]
    
    for interval in [l[1], l[2]]:
        if interval[0] >= max(cur_extent):
            extent.append(cur_extent)
            cur_extent = interval
        else:
            cur_extent[1] = max(cur_extent[1], interval[1])
        
    extent.append(cur_extent)
    
    return extent

df['Total'] = df.apply(lambda x: get_list(x[1], x[2], x[3]), axis=1)

Output:

   Object Sensor1 Time Sensor2 Time Sensor3 Time  \
0       1      [0, 10]      [5, 12]     [14, 20]   
1       2      [5, 30]      [5, 30]      [5, 35]   
2       3     [30, 40]     [45, 55]      [1, 20]   

                           Total  
0            [[0, 12], [14, 20]]  
1                      [[5, 35]]  
2  [[1, 20], [30, 40], [45, 55]]  
  • Related