Dividing a list into subsets according to a numeric array in Python

Time:09-21

I have a list with characters like so:

s = ['Y', 'U', 'U', 'N', 'U', 'U', 'N', 'N', 'N']

And the following array:

t = [2, 4, 3]

I would like to divide the list according to the array, so that each subset st[i] has length t[i]. The result for this example should be:

st = [['Y', 'U'], ['U', 'N', 'U', 'U'], ['N', 'N', 'N']]

If array t was:

t = [5, 2, 2]

Then the result should be:

st = [['Y', 'U', 'U', 'N', 'U'], ['U', 'N'], ['N', 'N']]

The inputs are s and t. I tried using two loops, one over the list s and another over the array t, but it does not work. How can I implement this?

CodePudding user response:

You can create an iterator from s and use itertools.islice to slice the iterator according to the sizes in t:

from itertools import islice

i = iter(s)
[list(islice(i, l)) for l in t]

This returns:

[['Y', 'U'], ['U', 'N', 'U', 'U'], ['N', 'N', 'N']]
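As a quick check against the second example from the question, the same idea can be wrapped in a small helper. This is only a sketch: the name split_by_sizes is made up for illustration and is not part of itertools.

```python
from itertools import islice


def split_by_sizes(items, sizes):
    """Split items into consecutive chunks whose lengths are given by sizes."""
    # One shared iterator: each islice call continues where the previous one stopped.
    it = iter(items)
    return [list(islice(it, n)) for n in sizes]


s = ['Y', 'U', 'U', 'N', 'U', 'U', 'N', 'N', 'N']
print(split_by_sizes(s, [5, 2, 2]))
# [['Y', 'U', 'U', 'N', 'U'], ['U', 'N'], ['N', 'N']]
```

Note that islice does not complain when the sizes do not add up to len(s): if they ask for too many items, the trailing chunks come back short or empty, and if they ask for too few, the leftover items are silently dropped.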

CodePudding user response:

For an input of [2, 4, 3], the starting indices would be:

  • 0
  • 0 + 2 = 2
  • 0 + 2 + 4 = 6

You can use itertools.accumulate() to collect the starting indices.

Once the starting indices are known, we just need to pair them via zip() with the number of items to take from each starting index, which is already what the list [2, 4, 3] gives us. Thus:

  • start 0 : count 2
  • start 2 : count 4
  • start 6 : count 3

Or, as @don'ttalkjustcode mentioned in the comments, we can track the accumulated stopping indices instead:

  • start (2 - 2 = 0) : stop 2
  • start (6 - 4 = 2) : stop 6
  • start (9 - 3 = 6) : stop 9

from itertools import accumulate

s = ['Y', 'U', 'U', 'N', 'U', 'U', 'N', 'N', 'N']

for t in [
    [2, 4, 3],
    [5, 2, 2],
    [1, 2, 6],
    [6, 1, 2],
    [2, 1, 4, 3],
    [2, 1, 2, 2, 1, 1],
]:
    # Option 1: Using start/count logic
    z = [s[start:start + count] for start, count in zip(accumulate([0] + t), t)]

    # Option 2: Using stop/count logic (thanks to @don'ttalkjustcode for pointing this out!)
    # z = [s[stop-count:stop] for stop, count in zip(accumulate(t), t)]

    print(z)

Output

[['Y', 'U'], ['U', 'N', 'U', 'U'], ['N', 'N', 'N']]
[['Y', 'U', 'U', 'N', 'U'], ['U', 'N'], ['N', 'N']]
[['Y'], ['U', 'U'], ['N', 'U', 'U', 'N', 'N', 'N']]
[['Y', 'U', 'U', 'N', 'U', 'U'], ['N'], ['N', 'N']]
[['Y', 'U'], ['U'], ['N', 'U', 'U', 'N'], ['N', 'N']]
[['Y', 'U'], ['U'], ['N', 'U'], ['U', 'N'], ['N'], ['N']]
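To make the intermediate values from the bullet lists above visible, here is a small sketch for t = [2, 4, 3] that builds the start and stop indices explicitly and then slices with both. The variable names starts, stops and chunks are only illustrative.

```python
from itertools import accumulate

s = ['Y', 'U', 'U', 'N', 'U', 'U', 'N', 'N', 'N']
t = [2, 4, 3]

# Running totals with a leading 0 give the start of each chunk
# (the final value, 9, is the total length and is not used as a start).
starts = list(accumulate([0] + t))
print(starts)  # [0, 2, 6, 9]

# Running totals without the leading 0 give the stop of each chunk.
stops = list(accumulate(t))
print(stops)  # [2, 6, 9]

# zip() stops at the shorter input, so the extra 9 in starts is ignored.
chunks = [s[start:stop] for start, stop in zip(starts, stops)]
print(chunks)
# [['Y', 'U'], ['U', 'N', 'U', 'U'], ['N', 'N', 'N']]
```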

CodePudding user response:

You can achieve your desired result by iteratively carving chunks out of a copy of s:

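i = 0
result = []
s_copy = s.copy()

# Stop when s_copy is exhausted or every size in t has been used,
# so t[i] never goes out of range if the sizes do not cover all of s.
while s_copy and i < len(t):
    result.append(s_copy[:t[i]])
    s_copy = s_copy[t[i]:]
    i += 1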

CodePudding user response:

Basically you can use:

s = ['Y', 'U', 'U', 'N', 'U', 'U', 'N', 'N', 'N']
t = [2, 4, 3]
st = []
offset = 0
for size in t:
    st.append(s[offset:size + offset])
    offset += size

print(st)
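If you want the split to fail loudly when the sizes do not match the list, the same offset loop can be put in a function with a check up front. This is a sketch under an assumption the question does not state, namely that the sizes in t must add up to exactly len(s); the name split_exact is hypothetical.

```python
def split_exact(items, sizes):
    """Split items into consecutive chunks; sizes must add up to len(items)."""
    # Assumption (not from the question): a mismatch is treated as an error.
    if any(size < 0 for size in sizes):
        raise ValueError("sizes must be non-negative")
    if sum(sizes) != len(items):
        raise ValueError(
            f"sizes add up to {sum(sizes)}, but there are {len(items)} items"
        )

    chunks = []
    offset = 0
    for size in sizes:
        chunks.append(items[offset:offset + size])
        offset += size
    return chunks


s = ['Y', 'U', 'U', 'N', 'U', 'U', 'N', 'N', 'N']
print(split_exact(s, [2, 4, 3]))
# [['Y', 'U'], ['U', 'N', 'U', 'U'], ['N', 'N', 'N']]
```

Calling split_exact(s, [2, 4]) raises ValueError instead of quietly returning only the first six items.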