Given a list:
data = [71.4, 72.73, 74.36, 75.38, 76.15, 76.96, 79.51, 86.82, 87.81, 87.87, 146.38, 150.89, 151.16, 152.18, 152.36, 153.27, 155.7, 160.99, 161.36, 164.55]
How do we write a function with two parameters, data and k? data would be a 1-D list of floating-point numbers and k would be the number of sublists we intend to return.
For example, using the list given above:
separate(data, 2)
should return ->
[[71.4, 72.73, 74.36, 75.38, 76.15, 76.96, 79.51, 86.82, 87.81, 87.87] , [146.38, 150.89, 151.16, 152.18, 152.36, 153.27, 155.7, 160.99, 161.36, 164.55]]
separate(data, 1)
should return ->
[[71.4, 72.73, 74.36, 75.38, 76.15, 76.96, 79.51, 86.82, 87.81, 87.87, 146.38, 150.89, 151.16, 152.18, 152.36, 153.27, 155.7, 160.99, 161.36, 164.55]]
separate(data, 3)
should return ->
[[71.4, 72.73, 74.36, 75.38, 76.15, 76.96, 79.51], [86.82, 87.81, 87.87] , [146.38, 150.89, 151.16, 152.18, 152.36, 153.27, 155.7, 160.99, 161.36, 164.55]]
CodePudding user response:
Step 1: Compute the gap sizes and their positions:
gaps = [(q - p, pos) for pos, (p, q) in enumerate(pairwise(data))]
Step 2: Extract the k-1 largest gaps:
large_gaps = nlargest(k-1, gaps)
Step 3: Use the positions to split the data: sort them and slice data just after each one.
The imports are (pairwise requires Python 3.10 or later):
from itertools import pairwise
from heapq import nlargest
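Putting the three steps together, here is a minimal sketch of what separate(data, k) could look like. It assumes data is already sorted in ascending order, as in the question. It uses zip(data, data[1:]) in place of pairwise so that it also runs on Python versions before 3.10; the k < 1 check and the name cuts are my own additions, not part of the steps above.

```python
from heapq import nlargest


def separate(data, k):
    """Split sorted `data` into at most k sublists at its k - 1 largest gaps."""
    if k < 1:
        raise ValueError("k must be at least 1")
    # Step 1: gap sizes paired with the index of the left-hand element.
    gaps = [(q - p, pos) for pos, (p, q) in enumerate(zip(data, data[1:]))]
    # Step 2: the k - 1 largest gaps; only their positions matter from here on.
    cuts = sorted(pos for _, pos in nlargest(k - 1, gaps))
    # Step 3: slice the data just after each cut position.
    out, prev = [], 0
    for pos in cuts:
        out.append(data[prev:pos + 1])
        prev = pos + 1
    out.append(data[prev:])
    return out
```

With the list from the question, separate(data, 3) cuts after 79.51 and after 87.87, which gives the three sublists asked for. If k is larger than len(data), you simply get one sublist per element.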
CodePudding user response:
If data is your initial list from the question, you can do:
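N = 3
# Keep the N - 1 largest gaps, each stored as (index of left element, gap size).
gaps = sorted(
    [(i, abs(a - b)) for i, (a, b) in enumerate(zip(data, data[1:]))],
    key=lambda k: -k[1],
)[: N - 1]
out, prev = [], 0
# Walk the chosen gaps in index order and cut the list after each one.
for idx, _ in sorted(gaps):
    out.append(data[prev : idx + 1])
    prev = idx + 1
if prev < len(data):
    out.append(data[prev:])
print(out)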
Prints:
[[71.4, 72.73, 74.36, 75.38, 76.15, 76.96, 79.51], [86.82, 87.81, 87.87], [146.38, 150.89, 151.16, 152.18, 152.36, 153.27, 155.7, 160.99, 161.36, 164.55]]
For N = 2:
[[71.4, 72.73, 74.36, 75.38, 76.15, 76.96, 79.51, 86.82, 87.81, 87.87], [146.38, 150.89, 151.16, 152.18, 152.36, 153.27, 155.7, 160.99, 161.36, 164.55]]