Grouping elements into a list

Time:07-22

I want to group elements into a list of lists based on an indexer. Starting with the first position in data, go until the next False (inclusive). That's a grouping. Continue until the last element.

data = ['a','b','c','d','e','f'] 
indexer = [True, True, False, False, True, True]

Outcome would be:

[['a','b','c'], ['d'], ['e','f'] ]

Is itertools groupby the right solution? I'm a little confused about how to implement it.

CodePudding user response:

You can append values to a temporary list and, whenever you reach a False, append that temporary list to the result and start a new one. In other words, a group ends after each False. Finally, if the last temporary list is not empty, append it to the result as well:

data = ['a', 'b', 'c', 'd', 'e', 'f']
indexer = [True, True, False, False, True, True]

result, temp = [], []
for value, index in zip(data, indexer):
    temp.append(value)
    if not index:
        result.append(temp)
        temp = []
if temp:
    result.append(temp)

print(result)
# [['a', 'b', 'c'], ['d'], ['e', 'f']]
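
The same loop can be wrapped in a function so it can be reused on other inputs. This is only a sketch: the name split_after_false is hypothetical and not part of the answer.

```python
def split_after_false(data, indexer):
    """Group items of data, closing a group after each False in indexer."""
    result, temp = [], []
    for value, flag in zip(data, indexer):
        temp.append(value)
        if not flag:
            # A False flag closes the current group (the item itself is included).
            result.append(temp)
            temp = []
    if temp:
        # Items after the last False form the final group.
        result.append(temp)
    return result


print(split_after_false(['a', 'b', 'c', 'd', 'e', 'f'],
                        [True, True, False, False, True, True]))
# [['a', 'b', 'c'], ['d'], ['e', 'f']]
```

Because of the final "if temp" check, an indexer that ends with False does not leave an empty group at the end.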

CodePudding user response:

Use accumulate, then groupby:

from itertools import groupby, accumulate
from operator import itemgetter

data = ['a', 'b', 'c', 'd', 'e', 'f']
indexer = [True, True, False, False, True, True]


groups = accumulate((not b for b in indexer), initial=0)
res = [[v for _, v in vs] for k, vs in groupby(zip(groups, data), key=itemgetter(0))]
print(res)

Output

[['a', 'b', 'c'], ['d'], ['e', 'f']]

In your particular example the variable groups is equivalent to:

[0, 0, 0, 1, 2, 2, 2]  # print(list(groups))

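The idea is to change the group id every time you encounter a False value, hence the need to negate it. Because of initial=0, the id only changes after the False element, so that element stays in the group it closes.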
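
To make the group ids concrete, here is a small sketch that pairs each id with its data item. The names increments, group_ids and pairs are illustrative only; accumulate with initial requires Python 3.8 or later.

```python
from itertools import accumulate

data = ['a', 'b', 'c', 'd', 'e', 'f']
indexer = [True, True, False, False, True, True]

# not True -> False (adds 0), not False -> True (adds 1)
increments = [not b for b in indexer]
group_ids = list(accumulate(increments, initial=0))
print(group_ids)
# [0, 0, 0, 1, 2, 2, 2]

# zip stops at the shorter input, so the extra trailing id is dropped
pairs = list(zip(group_ids, data))
print(pairs)
# [(0, 'a'), (0, 'b'), (0, 'c'), (1, 'd'), (2, 'e'), (2, 'f')]
```

groupby then only has to collect consecutive pairs that share the same id.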

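As an alternative, you could use a variation on @Matiiss's idea (all credit to him):

res = [[]]
for d, i in zip(data, indexer):
    res[-1].append(d)  # add to the group that is currently open
    if not i:
        res.append([])  # a False closes the group, so open a new one

# Note: if indexer ends with False, res ends with an empty list
print(res)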

Note: In Python you can sum booleans directly because bool is a subclass of int (True == 1, False == 0), which is why accumulate works on the negated flags above.
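
A quick check of that note, using only built-ins (the variable names are illustrative):

```python
# bool is a subclass of int, so True and False behave like 1 and 0 in arithmetic
is_int = isinstance(True, int)
total = sum([True, False, True])
count_false = sum(not b for b in [True, True, False, False, True, True])

print(is_int)       # True
print(total)        # 2
print(count_false)  # 2
```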

CodePudding user response:

A variation of Dani's answer without itemgetter: group the bare group numbers first, then zip each group with the data (as an iterator) afterwards:

from itertools import groupby, accumulate

data = ['a', 'b', 'c', 'd', 'e', 'f']
indexer = [True, True, False, False, True, True]

data = iter(data)
groups = accumulate((not b for b in indexer), initial=0)
res = [[d for _, d in zip(vs, data)] for _, vs in groupby(groups)]
print(res)

Two more ways, shifting the indexer by one so that each False marks the start of a new group rather than the end of one:

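# data is the original list here, not the iterator from the snippet above
res = []
for d, i in zip(data, [False] + indexer):
    if not i:
        res.append(r := [])  # a False in the shifted indexer starts a new group
    r.append(d)

# The same idea as a list comprehension: r.append(d) returns None, so only
# the items that start a group produce an element of res
res = [
    r := [d]
    for d, i in zip(data, [False] + indexer)
    if not i or r.append(d)
]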
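
To see what the shift does, it may help to print the shifted pairs. This is a small illustrative sketch; the names shifted, pairs and starts are not from the answer.

```python
data = ['a', 'b', 'c', 'd', 'e', 'f']
indexer = [True, True, False, False, True, True]

# Prepending False moves every flag one position to the right,
# so each item is paired with the flag of the item before it
shifted = [False] + indexer
pairs = list(zip(data, shifted))
print(pairs)
# [('a', False), ('b', True), ('c', True), ('d', False), ('e', False), ('f', True)]

# Items paired with False are the ones that start a new group
starts = [d for d, flag in pairs if not flag]
print(starts)
# ['a', 'd', 'e']
```

The first item always starts a group, and every item that follows a False in the original indexer starts the next one.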