List divide based on element from another list

Time:12-16

I have two lists as follows:

a = [646, 650, 654, 658, 662, 666, 670, 674, 678, 682, 686, 690, 694, 698, 702, 706, 13565, 13569, 13573, 13577, 13581, 13585, 13589, 13593, 13597, 13601, 13605, 13609, 13613, 13617, 13621, 13625, 13629, 13633, 13637, 13641, 13645, 13649, 13653, 13657, 13661, 21237, 21241, 21245, 21249, 21253, 21257, 21261, 21265, 21269, 21273, 21277, 21281, 21285, 21289, 21293, 21297, 21301, 21305, 21309, 21313, 21317, 21321, 21325, 21329, 21333, 21337, 21341, 21345]

b = [646, 706, 13661, 21345]

I want to break list a into smaller chunks based on the start and stop values in list b, for example:

[
[646, 650, 654, 658, 662, 666, 670, 674, 678, 682, 686, 690, 694, 698, 702, 706],
[13565, 13569, 13573, 13577, 13581, 13585, 13589, 13593, 13597, 13601, 13605, 13609, 13613, 13617, 13621, 13625, 13629, 13633, 13637, 13641, 13645, 13649, 13653, 13657, 13661],
[21237, 21241, 21245, 21249, 21253, 21257, 21261, 21265, 21269, 21273, 21277, 21281, 21285, 21289, 21293, 21297, 21301, 21305, 21309, 21313, 21317, 21321, 21325, 21329, 21333, 21337, 21341, 21345]
]

Can someone please help me figure this out?

CodePudding user response:

Solution 1: use bisect

I would solve this problem by using the bisect module to find where each item in a would be inserted in b to determine which bin an item belongs to.

This solution does not require that a be sorted, but it does require that b be sorted.

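import bisect

bin_boundaries = sorted(b)
# One bin per gap between boundaries, plus one edge bin on each side
results = [[] for _ in range(len(bin_boundaries) + 1)]
for i in a:
    # Index of the bin that i falls into
    pos = bisect.bisect_left(bin_boundaries, i)
    results[pos].append(i)
print(results)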

You did not specify whether an item equal to a boundary belongs in the previous or the next bin. I placed it in the previous bin. If you meant the next, replace bisect_left with bisect_right above.

I also output two more bins than your expected output shows: the first bin holds items less than or equal to the first boundary (with bisect_left, that includes 646 itself), and the last bin holds items greater than the last boundary. Add results = results[1:-1] if you want to remove those edge bins, keeping in mind that with bisect_left this also drops 646.
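To make the boundary handling concrete, here is a small sketch that runs both variants on shortened lists. The helper name bin_items and the sample data are made up for illustration and are not part of the question.

```python
import bisect

def bin_items(items, boundaries, side="left"):
    """Distribute items into len(boundaries) + 1 bins (hypothetical helper)."""
    bounds = sorted(boundaries)
    # bisect_left sends a boundary value to the previous bin,
    # bisect_right sends it to the next bin
    locate = bisect.bisect_left if side == "left" else bisect.bisect_right
    bins = [[] for _ in range(len(bounds) + 1)]
    for item in items:
        bins[locate(bounds, item)].append(item)
    return bins

# Shortened sample data, made up for illustration
sample_a = [1, 5, 10, 12, 20, 25]
sample_b = [1, 10, 20]

left_bins = bin_items(sample_a, sample_b, side="left")
right_bins = bin_items(sample_a, sample_b, side="right")
print(left_bins)   # [[1], [5, 10], [12, 20], [25]]
print(right_bins)  # [[], [1, 5], [10, 12], [20, 25]]
```

With bisect_left the lowest boundary value ends up alone in the first edge bin, while with bisect_right the highest boundary value ends up in the last edge bin, so trimming the edge bins discards one boundary value either way.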

Solution 2: just loop over the list for each bin

Now, here's a much simpler solution that just traverses a for each bin:

bin_boundaries = sorted(b)
results = []
for low, high in zip(bin_boundaries[:-1], bin_boundaries[1:]):
    results.append([i for i in a if i > low and i <= high])
print(results)

This time, I didn't create the edge bins. Again, fix the > and <= to match the semantics you actually want at the edges.

That outer loop can also be turned into a list comprehension, to give you this nested list comprehension and a very compact solution:

results = [
    [i for i in a if i > low and i <= high]
    for low, high in zip(bin_boundaries[:-1], bin_boundaries[1:])
]
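As written, the comparisons leave the lowest boundary (646) out of the first bin, whereas the expected output in the question includes it. One possible adjustment, sketched here on the assumption that only the first bin should be closed on both ends, is to switch the lower comparison for that bin. The function name split_by_bounds and the shortened sample lists are made up for illustration.

```python
def split_by_bounds(items, boundaries):
    """Bin items between consecutive boundaries (hypothetical helper)."""
    bounds = sorted(boundaries)
    chunks = []
    for n, (low, high) in enumerate(zip(bounds[:-1], bounds[1:])):
        if n == 0:
            # First bin: include both the lower and the upper boundary
            chunks.append([i for i in items if low <= i <= high])
        else:
            # Later bins: exclude the lower boundary, include the upper one
            chunks.append([i for i in items if low < i <= high])
    return chunks

# Shortened version of the lists from the question
sample_a = [646, 650, 706, 13565, 13661, 21237, 21345]
sample_b = [646, 706, 13661, 21345]

chunks = split_by_bounds(sample_a, sample_b)
print(chunks)  # [[646, 650, 706], [13565, 13661], [21237, 21345]]
```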

CodePudding user response:

From your example, I understand that you want the first interval to include both boundaries ([646, 706]), while the others include only the upper boundary (]706, 13661], ]13661, 21345]).

Here I use the .index method and a for loop that includes the lower boundary for the first interval and excludes it for the others:

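lists_result = []

for i in range(len(b) - 1):
    idx_inf = a.index(b[i])      # position of the lower boundary in a
    idx_sup = a.index(b[i + 1])  # position of the upper boundary in a
    if i == 0:
        # First interval: include the lower boundary
        lists_result.append(a[idx_inf:idx_sup + 1])
    else:
        # Later intervals: skip the lower boundary
        lists_result.append(a[idx_inf + 1:idx_sup + 1])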
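The same slicing idea can be wrapped in a function, which may make its assumptions easier to see. This is only a sketch: the name split_at_markers and the shortened lists are made up for illustration. It relies on every value of b occurring in a, in the same order, because list.index raises ValueError for a missing value and returns the first match when a value is repeated.

```python
def split_at_markers(items, markers):
    """Slice items at the positions of the marker values (hypothetical helper)."""
    chunks = []
    start = items.index(markers[0])  # the first chunk includes its lower marker
    for marker in markers[1:]:
        stop = items.index(marker)   # raises ValueError if marker is absent
        chunks.append(items[start:stop + 1])
        start = stop + 1             # later chunks start just after the marker
    return chunks

# Shortened version of the lists from the question
sample_a = [646, 650, 706, 13565, 13661, 21237, 21345]
sample_b = [646, 706, 13661, 21345]

chunks = split_at_markers(sample_a, sample_b)
print(chunks)  # [[646, 650, 706], [13565, 13661], [21237, 21345]]
```

Unlike the comparison-based solutions, this version depends on the position of the values in a rather than on their magnitude, so it keeps the original order of a but fails if a boundary value is missing.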