Grouping elements into a list

I want to group elements into a list of lists based on an indexer. Starting with the first position in data, go until the next False; that's one group. Then continue the same way until the last element.

data = ['a','b','c','d','e','f'] 
indexer = [True, True, False, False, True, True]

Outcome would be:

[['a','b','c'], ['d'], ['e','f'] ]

Is itertools groupby the right solution? I'm a little confused about how to implement it.

CodePudding user response：

You can simply append values to a temporary list. Whenever you reach a False, append the temporary list to the result and start a new one, so a new list begins after each False. At the end, append the last temporary list to the result if it is not empty:

data = ['a', 'b', 'c', 'd', 'e', 'f']
indexer = [True, True, False, False, True, True]

result, temp = [], []
for value, index in zip(data, indexer):
    temp.append(value)
    if not index:
        result.append(temp)
        temp = []
if temp:
    result.append(temp)

print(result)
# [['a', 'b', 'c'], ['d'], ['e', 'f']]
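
If the grouping is needed in more than one place, the same loop could be wrapped in a function. This is only a sketch: the name split_after_false is made up for illustration, and it assumes data and indexer have the same length (zip silently ignores any extra items).

```python
def split_after_false(data, indexer):
    """Group items so that each group ends at a False flag (hypothetical helper)."""
    result, temp = [], []
    for value, flag in zip(data, indexer):
        temp.append(value)
        if not flag:
            # A False flag closes the current group.
            result.append(temp)
            temp = []
    if temp:
        # Keep trailing items that were never closed by a False.
        result.append(temp)
    return result


print(split_after_false(['a', 'b', 'c', 'd', 'e', 'f'],
                        [True, True, False, False, True, True]))
# [['a', 'b', 'c'], ['d'], ['e', 'f']]

print(split_after_false(['a', 'b'], [True, False]))
# [['a', 'b']]
```

Because of the final "if temp" check, an indexer that ends in False does not produce a trailing empty list.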

CodePudding user response：

Use accumulate, then groupby:

from itertools import groupby, accumulate
from operator import itemgetter

data = ['a', 'b', 'c', 'd', 'e', 'f']
indexer = [True, True, False, False, True, True]


groups = accumulate((not b for b in indexer), initial=0)
res = [[v for _, v in vs] for k, vs in groupby(zip(groups, data), key=itemgetter(0))]
print(res)

Output

[['a', 'b', 'c'], ['d'], ['e', 'f']]

In your particular example the variable groups is equivalent to:

[0, 0, 0, 1, 2, 2, 2]  # print(list(groups))

The idea is to change the group id every time you encounter a False value, hence the need to negate each flag.
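
To see how the group ids line up with the data, here is a small sketch that materializes them (the names group_ids and pairs are only for illustration; the initial argument of accumulate needs Python 3.8 or later):

```python
from itertools import accumulate

data = ['a', 'b', 'c', 'd', 'e', 'f']
indexer = [True, True, False, False, True, True]

# "not True" adds 0 to the running total, "not False" adds 1,
# so the id increases right after every False.
group_ids = list(accumulate((not b for b in indexer), initial=0))
print(group_ids)
# [0, 0, 0, 1, 2, 2, 2]

# zip stops at the shorter input, so the extra trailing id is ignored.
pairs = list(zip(group_ids, data))
print(pairs)
# [(0, 'a'), (0, 'b'), (0, 'c'), (1, 'd'), (2, 'e'), (2, 'f')]
```

The initial=0 is what delays the change by one position: the item sitting on a False still gets the old id, and the next item gets the new one.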

As an alternative, you could use a variation on @Matiiss's idea (all credit to him):

res = [[]]
for d, i in zip(data, indexer):
    res[-1].append(d)
    if not i:
        res.append([])

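print(res)
# [['a', 'b', 'c'], ['d'], ['e', 'f']]
# (if the last flag were False, res would end with an empty list)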

Note: In Python you can directly sum booleans because they are integers (bool is a subclass of int), which is why accumulate works on the negated flags.
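
A quick check of that note, as a minimal sketch (the names flags and false_count are only for illustration):

```python
# bool is a subclass of int, so True behaves like 1 and False like 0.
print(isinstance(True, int))
# True
print(True + True)
# 2

flags = [True, True, False, False, True, True]
# Counting the False values by summing the negated flags.
false_count = sum(not f for f in flags)
print(false_count)
# 2
```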

CodePudding user response：

Variation of Dani's without itemgetter: group the pure group numbers first, and zip them with the data (as an iterator) later:

from itertools import groupby, accumulate

data = ['a', 'b', 'c', 'd', 'e', 'f']
indexer = [True, True, False, False, True, True]

data = iter(data)
groups = accumulate((not b for b in indexer), initial=0)
res = [[d for _, d in zip(vs, data)] for _, vs in groupby(groups)]
print(res)

Two more ways using that shift-the-indexer-so-we-split-before-False idea (here data is the original list again, not the iterator):

res = []
for d, i in zip(data, [False] + indexer):
    if not i:
        res.append(r := [])
    r.append(d)

res = [
    r := [d]
    for d, i in zip(data, [False] + indexer)
    if not i or r.append(d)
]
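
The shift used in the two snippets above can be made visible with a small sketch (the names shifted, pairs and starts are only for illustration). Prepending False pairs every item with the flag of the item before it, so an item paired with False is one that opens a new group:

```python
data = ['a', 'b', 'c', 'd', 'e', 'f']
indexer = [True, True, False, False, True, True]

# Prepend False so each flag is paired with the NEXT item.
shifted = [False] + indexer

# zip stops after the last item in data, so the final flag is dropped.
pairs = list(zip(data, shifted))
print(pairs)
# [('a', False), ('b', True), ('c', True), ('d', False), ('e', False), ('f', True)]

# Items paired with False are the ones that open a new group.
starts = [d for d, flag in pairs if not flag]
print(starts)
# ['a', 'd', 'e']
```

Since the first shifted flag is always False, the first item always opens a group, which is why r is guaranteed to be bound before r.append(d) runs in both variants. The := operator requires Python 3.8 or later.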