How to split a list using marker elements?

Time:06-13

I'm trying to split a list in Python using some elements as markers. For example, consider the list:

["marker1", "elem1", "elem2", "marker2", "elem3"]

I wish to split it into 2 sublists:

[["marker1", "elem1", "elem2"], ["marker2", "elem3"]]

If the first element is not a marker, the elements before the marker shall be considered as a separate sublist:

# From:
["elem1", "elem2", "marker1", "elem3", "marker2", "elem4", "elem5"]
# To:
[["elem1", "elem2"], ["marker1", "elem3"], ["marker2", "elem4", "elem5"]]

It is easy to do using a regular loop:

lst = ["elem1", "elem2", "marker1", "elem3", "marker2", "elem4", "elem5"]

separated = []
sub_lst = []
for elem in lst:
    if elem[:6] == "marker" and sub_lst:
        separated.append(sub_lst)
        sub_lst = []
    sub_lst.append(elem)
if sub_lst:
    separated.append(sub_lst)

This code is 9 lines long. My question is how to do this in one line (or so) using a list comprehension or some other functional style. Any other elegant solutions are welcome as well.

CodePudding user response:

You could look up the indexes of the marker elements in the list, and then take sublists based on those positions:

lst = ["elem1", "elem2", "marker1", "elem3", "marker2", "elem4", "elem5"]
idxs = [i for i, v in enumerate(lst) if type(v) == str and v.startswith('marker')]
separated = [lst[i:j] for i, j in zip([0] + idxs, idxs + [len(lst)]) if i < j]
# [['elem1', 'elem2'], ['marker1', 'elem3'], ['marker2', 'elem4', 'elem5']]

lst = ["marker1", "elem1", "elem2", "marker2", "elem3"]
idxs = [i for i, v in enumerate(lst) if type(v) == str and v.startswith('marker')]
separated = [lst[i:j] for i, j in zip([0] + idxs, idxs + [len(lst)]) if i < j]
# [['marker1', 'elem1', 'elem2'], ['marker2', 'elem3']]

Adapted from this answer.
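
For reuse, the same index-based idea can be wrapped in a function. This is only a minimal sketch: the names split_at_markers and is_marker are hypothetical, and the marker rule (a string starting with "marker") is taken from the question.

```python
def is_marker(v):
    # Assumed marker rule from the question: a string starting with "marker".
    return isinstance(v, str) and v.startswith("marker")


def split_at_markers(lst, is_marker):
    """Split lst into sublists, starting a new sublist at each marker."""
    # Positions at which a new sublist begins.
    idxs = [i for i, v in enumerate(lst) if is_marker(v)]
    # Pair each start with the next start (or with the end of the list).
    # The `i < j` filter drops the empty leading slice when lst[0] is a marker.
    return [lst[i:j] for i, j in zip([0] + idxs, idxs + [len(lst)]) if i < j]


print(split_at_markers(["marker1", "elem1", "elem2", "marker2", "elem3"], is_marker))
```

Passing the predicate as an argument keeps the slicing logic independent of how markers are recognised.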

CodePudding user response:

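groupby splits the list into alternating runs of non-marker and marker elements. Collected into a list, i would look like [['elem1', 'elem2'], ['marker1'], ['elem3'], ['marker2'], ['elem4', 'elem5'], ['marker3'], ['elem6', 'elem7']], but it is a generator, of course.

So we pass the same generator i to zip twice, which combines every two consecutive runs, e.g. ['elem1', 'elem2'] and ['marker1'].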

from itertools import groupby

lst = ["elem1", "elem2", "marker1", "elem3", "marker2", "elem4", "elem5", "marker3", "elem6", "elem7"]

i = (list(g) for _, g in groupby(lst, key=lambda x: x.startswith('marker')))

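print([a + b for a, b in zip(i, i)])
# [['elem1', 'elem2', 'marker1'], ['elem3', 'marker2'], ['elem4', 'elem5', 'marker3']]
# Note: zip drops the last run (['elem6', 'elem7']) because it has no partner.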

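Edit 1: if the list starts with a marker, each marker run is paired with the elements that follow it, and zip_longest keeps a trailing run that has no partner: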

from itertools import groupby, zip_longest

lst = ["marker0", "elem1", "elem2", "marker1", "elem3", "marker2", "elem4", "elem5", "marker3", "elem6", "elem7"]

i = (list(g) for _, g in groupby(lst, key=lambda x: x.startswith('marker')))

print([a + b for a, b in zip_longest(i, i, fillvalue=[])])

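Edit 2: if the list starts with a non-marker element, prepend an empty list so that the leading elements form their own sublist and each marker run is again paired with the elements after it: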

from itertools import groupby, zip_longest

lst = ["elem1", "elem2", "marker1", "elem3", "marker2", "elem4", "elem5", "marker3", "elem6", "elem7"]

i = iter([[]] + [list(g) for _, g in groupby(lst, key=lambda x: x.startswith('marker'))])

print([a + b for a, b in zip_longest(i, i, fillvalue=[])])
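
Each variant above depends on whether the list starts with a marker. As a possible alternative (not the method used in this answer), groupby can be keyed on a running count of the markers seen so far, which is the same for every element of one sublist. The sketch below assumes the marker rule from the question; split_by_marker_count and is_marker are hypothetical names.

```python
from itertools import accumulate, groupby


def is_marker(v):
    # Assumed marker rule from the question: a string starting with "marker".
    return isinstance(v, str) and v.startswith("marker")


def split_by_marker_count(lst, is_marker):
    """Group elements by how many markers have been seen up to and including them."""
    # The running count increases by one at every marker, so it stays
    # constant from one marker up to (but not including) the next one.
    counts = accumulate(int(is_marker(x)) for x in lst)
    pairs = zip(lst, counts)
    # groupby starts a new group whenever the count changes.
    return [[x for x, _ in grp] for _, grp in groupby(pairs, key=lambda p: p[1])]


print(split_by_marker_count(["elem1", "elem2", "marker1", "elem3"], is_marker))
```

Because the key changes at every marker, two consecutive markers each start their own sublist, whereas grouping on x.startswith('marker') merges them into one run.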

CodePudding user response:

Create (and include) a new inner list at the start or at a marker, otherwise append to the current inner list.

a = None
separated = [a := [x] for x in lst if not a or x.startswith('marker') or a.append(x)]
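
The comprehension relies on the assignment expression (:=), which requires Python 3.8 or later, and on a.append(x) returning None, so appended elements are filtered out rather than emitted. To make that logic easier to follow, here is a sketch of the same steps written as an ordinary loop; the function name split_unrolled is hypothetical, and all elements are assumed to be strings, as in the question.

```python
def split_unrolled(lst):
    """Loop form of the comprehension: start a new inner list or append to it."""
    separated = []
    current = None  # plays the role of `a` in the comprehension
    for x in lst:
        if not current or x.startswith("marker"):
            # At the start of the list or at a marker: create a new inner
            # list and include it in the result (the `a := [x]` part).
            current = [x]
            separated.append(current)
        else:
            # Otherwise grow the current inner list in place; it is already
            # in the result, so nothing new is emitted (the `a.append(x)` part).
            current.append(x)
    return separated


print(split_unrolled(["elem1", "elem2", "marker1", "elem3", "marker2", "elem4", "elem5"]))
```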
