I have an ordered list, for example:
my_list = [0.1, 0.2, 0.3, 0.4, 2.6, 2.7, 2.8, 2.9,
5.1, 5.2, 6.1, 6.2, 6.3, 7.1, 7.2, 7.3,
7.4, 7.5, 10.1, 10.2, 10.3, 10.4, 10.5]
I need the intervals in which consecutive numbers are less than 1 s apart and which contain at least 3 numbers. I only want the start and end of each interval. For example:
Output: [[0.1, 0.4], [2.6, 2.9], [5.1, 7.5], [10.1, 10.5]]
or
In[0]: print(start)
Output: [0.1, 2.6, 5.1, 10.1]
In[1]: print(end)
Output: [0.4, 2.9, 7.5, 10.5]
I've tried a variety of loops, but I'm having trouble getting only the start times appended to a new list, and also having trouble avoiding "Index out of range" when getting to the end of the list. Here is where I'm at currently:
for i in range(0, (len(my_list)-2)):
    second = i + 1
    third = i + 2
    if ((my_list[third] - my_list[second])) < 1 and ((my_list[second] - my_list[i]) < 1):
        temp.append(my_list[i])
    else:
        end.append(my_list[second])
        start.append(temp[0])
        temp.clear()
My solution to get only the start of the interval is to append the items to a temporary list, append the first and last element and clear that list. I'm sure there is a more elegant way to do this, and the list can be thousands of rows so I don't think this is a very efficient method.
Any help would be much appreciated.
CodePudding user response:
Here's one method, using numpy with a list comprehension:
import numpy as np
out = [[arr[0], arr[-1]] for arr in np.split(my_list, np.where(np.diff(my_list) > 1)[0] + 1) if len(arr) > 2]
Or if you don't want to use numpy, you can use 2 list comprehensions to find the same:
splits = [0] + [idx + 1 for idx, (i, j) in enumerate(zip(my_list, my_list[1:])) if j - i > 1] + [len(my_list)]
out = [[my_list[start], my_list[end-1]] for start, end in zip(splits, splits[1:]) if end - start > 2]
Output:
[[0.1, 0.4], [2.6, 2.9], [5.1, 7.5], [10.1, 10.5]]
CodePudding user response:
Here's one way:
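# A new interval starts at the value after each gap and ends at the value before it.
# Note: this does not filter out intervals with fewer than 3 numbers.
start = my_list[:1] + [z[1] for z in zip(my_list[0:-1], my_list[1:]) if z[0] + 1 < z[1]]
end = [z[0] for z in zip(my_list[0:-1], my_list[1:]) if z[0] + 1 < z[1]] + my_list[-1:]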
CodePudding user response:
You can use a stack to keep track of the values that satisfy the condition. When a new value breaks the run, check whether the stack holds at least 3 elements; if it does, add its first and last elements to the result list. Then reset the stack so that it holds only the new value.
Here is another way:
my_list = [0.1, 0.2, 0.3, 0.4, 2.6, 2.7, 2.8, 2.9,
5.1, 5.2, 6.1, 6.2, 6.3, 7.1, 7.2, 7.3,
7.4, 7.5, 10.1, 10.2, 10.3, 10.4, 10.5]
result = []
tmp =[]
for i, v in enumerate(my_list):
    if not tmp:
        tmp.append(v)
    else:
        if abs(tmp[-1]-v)<1:
            tmp.append(v)
        else:
            if len(tmp)>=3:
                result.append([tmp[0], tmp[-1]])
            tmp = [v]
if tmp and len(tmp)>=3:
    result.append([tmp[0], tmp[-1]])
print(result)
# output: [[0.1, 0.4], [2.6, 2.9], [5.1, 7.5], [10.1, 10.5]]
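Since the question mentions lists with thousands of rows, here is a minimal sketch of the same single-pass idea that keeps only the run's start, its last value, and a counter instead of a temporary list. The function name find_intervals and the parameters max_gap and min_count are made up for illustration, and the input is assumed to be sorted in ascending order.

```python
def find_intervals(values, max_gap=1.0, min_count=3):
    """Return [start, end] pairs for runs of sorted values whose
    neighbours are less than max_gap apart (hypothetical helper)."""
    intervals = []
    start = prev = None
    count = 0  # number of values in the current run
    for v in values:
        if count and v - prev < max_gap:
            # v continues the current run
            count += 1
        else:
            # v starts a new run; keep the finished one if it is long enough
            if count >= min_count:
                intervals.append([start, prev])
            start = v
            count = 1
        prev = v
    # the last run is never closed inside the loop, so handle it here
    if count >= min_count:
        intervals.append([start, prev])
    return intervals


my_list = [0.1, 0.2, 0.3, 0.4, 2.6, 2.7, 2.8, 2.9,
           5.1, 5.2, 6.1, 6.2, 6.3, 7.1, 7.2, 7.3,
           7.4, 7.5, 10.1, 10.2, 10.3, 10.4, 10.5]
print(find_intervals(my_list))
# [[0.1, 0.4], [2.6, 2.9], [5.1, 7.5], [10.1, 10.5]]
```

Because the loop iterates over the values directly rather than over index offsets, there is no risk of "Index out of range" at the end of the list.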
CodePudding user response:
Using Pandas:
import pandas as pd
arr = pd.Series(my_list)
arr = arr.groupby(arr.astype(int)).nth([0,-1])
result = list(zip(arr[::2], arr[1::2]))
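Or without pandas you can use itertools.groupby with int as the key. (Note: this assumes the list is sorted. It also groups by integer part rather than by gap, so for the sample list it returns [5.1, 5.2], [6.1, 6.3] and [7.1, 7.5] as separate intervals instead of [5.1, 7.5].)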
from itertools import groupby
my_list = [0.1, 0.2, 0.3, 0.4, 2.6, 2.7, 2.8, 2.9,
5.1, 5.2, 6.1, 6.2, 6.3, 7.1, 7.2, 7.3,
7.4, 7.5, 10.1, 10.2, 10.3, 10.4, 10.5]
result = []
for k, g in groupby(my_list, int):
    group = list(g)
    result.append([group[0], group[-1]])
Or as a comprehension:
result = [[f:=next(g), [f, *g][-1]] for k, g in groupby(my_list, int)]
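Using int as the key buckets values by their integer part, so 5.2 and 6.1 land in different groups even though they are less than 1 apart. If the grouping should follow the gaps instead, one possible sketch is to number the runs first and group on that number. The helper name intervals_by_gap and its parameters max_gap and min_count are assumptions for illustration, not part of the answer above; the list is again assumed to be sorted.

```python
from itertools import accumulate, groupby


def intervals_by_gap(values, max_gap=1.0, min_count=3):
    """Group sorted values into runs separated by gaps >= max_gap
    and return [first, last] of each run with at least min_count items."""
    # breaks[i] is 1 when values[i] starts a new run, else 0
    breaks = [0] + [int(b - a >= max_gap) for a, b in zip(values, values[1:])]
    # the running sum of breaks gives every value the id of its run
    run_ids = accumulate(breaks)
    out = []
    for _, grp in groupby(zip(run_ids, values), key=lambda pair: pair[0]):
        run = [v for _, v in grp]
        if len(run) >= min_count:
            out.append([run[0], run[-1]])
    return out


my_list = [0.1, 0.2, 0.3, 0.4, 2.6, 2.7, 2.8, 2.9,
           5.1, 5.2, 6.1, 6.2, 6.3, 7.1, 7.2, 7.3,
           7.4, 7.5, 10.1, 10.2, 10.3, 10.4, 10.5]
print(intervals_by_gap(my_list))
# [[0.1, 0.4], [2.6, 2.9], [5.1, 7.5], [10.1, 10.5]]
```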