How do I find a sequence of datetime values in a list, with at least a certain timedelta?

Time:11-04

How do I get a sequence of datetime values out of a sorted list such that consecutive values in the sequence are at least x seconds apart? The sequence should always start with the first value of the list and end with the last value that is at least x seconds later than the preceding value in the sequence.

Example with x = 1 second:

Input_Arr = np.array([datetime(1981,1,1,0,0,0,40),
                      datetime(1981,1,1,0,0,0,80),
                      datetime(1981,1,1,0,0,1,20),
                      datetime(1981,1,1,0,0,1,50),
                      datetime(1981,1,1,0,0,2,00),
                      datetime(1981,1,1,0,0,2,70),
                      datetime(1981,1,1,0,0,3,10),
                      datetime(1981,1,1,0,0,3,40),
                      datetime(1981,1,1,0,0,4,20),
                      datetime(1981,1,1,0,0,5,00)])

Output_Arr = np.array([datetime(1981,1,1,0,0,0,40),
                       datetime(1981,1,1,0,0,1,50),
                       datetime(1981,1,1,0,0,2,70),
                       datetime(1981,1,1,0,0,4,20)])

I think a combination of the following two approaches will be the solution, but I can't get the code right: here and here.

The following works, but is not very clean and also not fast for bigger arrays.

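Output_Arr = list()
time_delta = 1  # minimal gap in seconds
for id, value in enumerate(Input_Arr):
    if id == 0:
        # Always keep the first value
        Output_Arr.append(value)
    else:
        # Keep the value only if it lies more than time_delta after the last kept one
        if Output_Arr[-1] + timedelta(seconds=time_delta) < value:
            Output_Arr.append(value)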

CodePudding user response:

The most straightforward way is to use a loop and compare times using timedelta.

from datetime import datetime, timedelta
from math import ceil
import numpy as np

input_arr = np.array([datetime(1981,1,1,0,0,0,40),
                      datetime(1981,1,1,0,0,0,80),
                      datetime(1981,1,1,0,0,1,20),
                      datetime(1981,1,1,0,0,1,50),
                      datetime(1981,1,1,0,0,2,00),
                      datetime(1981,1,1,0,0,2,70),
                      datetime(1981,1,1,0,0,3,10),
                      datetime(1981,1,1,0,0,3,40),
                      datetime(1981,1,1,0,0,4,20),
                      datetime(1981,1,1,0,0,5,00)])

# Pre-allocate the maximal array size possible
x = timedelta(seconds=1)  # Define the minimal time delta
# ceil rounds the fractional ratio up; + 1 accounts for the first element
nb_max_output = ceil((input_arr[-1] - input_arr[0]) / x) + 1
output_arr = np.zeros(nb_max_output, dtype=object)  # Pre-allocation

# Initialize variables
output_arr[0] = input_arr[0]
next_output = 1
   
for i in range(1, len(input_arr)):

    # Check time difference with the last time in output
    if input_arr[i] - output_arr[next_output-1] >= x:
        # Add to the output array and advance the write position
        output_arr[next_output] = input_arr[i]
        next_output += 1

# Truncate output array
output_arr = output_arr[:next_output] 

# Show result
print(output_arr)

The output is:

>> [datetime.datetime(1981, 1, 1, 0, 0, 0, 40)
    datetime.datetime(1981, 1, 1, 0, 0, 1, 50)
    datetime.datetime(1981, 1, 1, 0, 0, 2, 70)
    datetime.datetime(1981, 1, 1, 0, 0, 4, 20)]

The trick is to use timedelta to compare your dates and to pre-allocate your array, because growing a NumPy array dynamically is usually slow. There are probably shorter ways to write this in Python, and further optimizations are possible, but this is the biggest one that comes to mind.
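
As one possible shorter variant, here is a minimal sketch that applies the same selection rule with a plain Python list instead of a pre-allocated array. The function name `select_spaced` and its parameters are hypothetical, chosen for illustration only, and the sketch assumes the input is already sorted in ascending order.

```python
from datetime import datetime, timedelta


def select_spaced(times, min_gap):
    """Keep the first value, then every value at least min_gap after the last kept one.

    Hypothetical helper; assumes `times` is sorted in ascending order.
    """
    kept = []
    for t in times:
        # The first value is always kept; later values need a large enough gap
        if not kept or t - kept[-1] >= min_gap:
            kept.append(t)
    return kept


input_list = [datetime(1981, 1, 1, 0, 0, 0, 40),
              datetime(1981, 1, 1, 0, 0, 0, 80),
              datetime(1981, 1, 1, 0, 0, 1, 20),
              datetime(1981, 1, 1, 0, 0, 1, 50),
              datetime(1981, 1, 1, 0, 0, 2, 0),
              datetime(1981, 1, 1, 0, 0, 2, 70),
              datetime(1981, 1, 1, 0, 0, 3, 10),
              datetime(1981, 1, 1, 0, 0, 3, 40),
              datetime(1981, 1, 1, 0, 0, 4, 20),
              datetime(1981, 1, 1, 0, 0, 5, 0)]

result = select_spaced(input_list, timedelta(seconds=1))
print(result)
```

Appending to a Python list is amortized constant time, so this variant needs no pre-allocation. If a NumPy array is required afterwards, the result can be wrapped with np.array(result, dtype=object).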
