I'm trying to split a list in Python using some elements as markers. For example, consider the list:
["marker1", "elem1", "elem2", "marker2", "elem3"]
I wish to split it into 2 sublists:
[["marker1", "elem1", "elem2"], ["marker2", "elem3"]]
If the first element is not a marker, the elements before the first marker should form a separate sublist:
# From:
["elem1", "elem2", "marker1", "elem3", "marker2", "elem4", "elem5"]
# To:
[["elem1", "elem2"], ["marker1", "elem3"], ["marker2", "elem4", "elem5"]]
It is easy to do using a regular loop:
lst = ["elem1", "elem2", "marker1", "elem3", "marker2", "elem4", "elem5"]
separated = []
sub_lst = []
for elem in lst:
    if elem[:6] == "marker" and sub_lst:
        separated.append(sub_lst)
        sub_lst = []
    sub_lst.append(elem)
if sub_lst:
    separated.append(sub_lst)
This code is 9 lines long. My question is how to do this in one line (or so) using a list comprehension or some other functional style. Any other elegant solutions are welcome as well.
CodePudding user response:
You could look up the indexes of the marker elements in the list, and then take sublists based on those positions:
lst = ["elem1", "elem2", "marker1", "elem3", "marker2", "elem4", "elem5"]
idxs = [i for i, v in enumerate(lst) if isinstance(v, str) and v.startswith('marker')]
separated = [lst[i:j] for i, j in zip([0] + idxs, idxs + [len(lst)]) if i < j]
# [['elem1', 'elem2'], ['marker1', 'elem3'], ['marker2', 'elem4', 'elem5']]
lst = ["marker1", "elem1", "elem2", "marker2", "elem3"]
idxs = [i for i, v in enumerate(lst) if isinstance(v, str) and v.startswith('marker')]
separated = [lst[i:j] for i, j in zip([0] + idxs, idxs + [len(lst)]) if i < j]
# [['marker1', 'elem1', 'elem2'], ['marker2', 'elem3']]
Adapted from this answer.
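If you need this in more than one place, the same index-based idea can be wrapped in a small helper. This is only a sketch: the names split_at_markers and is_marker are made up for illustration, and the marker test is passed in as a predicate so it is not tied to the 'marker' prefix.

```python
def split_at_markers(items, is_marker):
    """Split items into sublists, starting a new sublist at every marker.

    Both names are hypothetical helpers for this example; is_marker is
    any function that returns True for a marker element.
    """
    # Positions of all marker elements.
    idxs = [i for i, v in enumerate(items) if is_marker(v)]
    # Pair each start index with the next one; 0 and len(items) close the ends.
    bounds = zip([0] + idxs, idxs + [len(items)])
    # i < j skips the empty slice produced when the list starts with a marker.
    return [items[i:j] for i, j in bounds if i < j]


def starts_with_marker(v):
    return isinstance(v, str) and v.startswith("marker")


print(split_at_markers(["elem1", "elem2", "marker1", "elem3"], starts_with_marker))
# [['elem1', 'elem2'], ['marker1', 'elem3']]
```

Unlike the groupby-based approach, two markers in a row each start their own sublist here, which matches the behaviour of the loop in the question.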
CodePudding user response:
The groupby call produces the groups [['elem1', 'elem2'], ['marker1'], ['elem3'], ['marker2'], ['elem4', 'elem5'], ['marker3'], ['elem6', 'elem7']], but lazily, as a generator i.
Passing the same generator i to zip twice then combines every two consecutive lists, for example ['elem1', 'elem2'] and ['marker1']:
from itertools import groupby
lst = ["elem1", "elem2", "marker1", "elem3", "marker2", "elem4", "elem5", "marker3", "elem6", "elem7"]
i = (list(g) for _, g in groupby(lst, key=lambda x: x.startswith('marker')))
# zip stops at the shorter input, so an unpaired trailing group is dropped
print([a + b for a, b in zip(i, i)])
Edit1
from itertools import groupby, zip_longest
lst = ["marker0", "elem1", "elem2", "marker1", "elem3", "marker2", "elem4", "elem5", "marker3", "elem6", "elem7"]
i = (list(g) for _, g in groupby(lst, key=lambda x: x.startswith('marker')))
print([a + b for a, b in zip_longest(i, i, fillvalue=[])])
Edit2
from itertools import groupby, zip_longest
lst = ["elem1", "elem2", "marker1", "elem3", "marker2", "elem4", "elem5", "marker3", "elem6", "elem7"]
i = iter([[]] + [list(g) for _, g in groupby(lst, key=lambda x: x.startswith('marker'))])
print([a + b for a, b in zip_longest(i, i, fillvalue=[])])
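The two edits cover different inputs: Edit1 fits a list that starts with a marker, Edit2 a list that starts with ordinary elements. A possible way to handle both is to add the empty padding group only when it is needed. The sketch below assumes that; split_with_groupby and is_marker are hypothetical names, not part of itertools.

```python
from itertools import groupby, zip_longest


def is_marker(x):
    # Hypothetical marker test, matching the examples in this thread.
    return isinstance(x, str) and x.startswith("marker")


def split_with_groupby(lst):
    # Alternating runs of markers and non-markers.
    groups = [list(g) for _, g in groupby(lst, key=is_marker)]
    # If the list starts with ordinary elements, pad with an empty group
    # so that the pairs below always line up as (markers, elements).
    if lst and not is_marker(lst[0]):
        groups.insert(0, [])
    it = iter(groups)
    # Reading the same iterator twice pairs consecutive groups;
    # fillvalue=[] keeps a trailing marker group that has no elements.
    return [a + b for a, b in zip_longest(it, it, fillvalue=[])]


print(split_with_groupby(["elem1", "elem2", "marker1", "elem3"]))
# [['elem1', 'elem2'], ['marker1', 'elem3']]
```

One caveat: groupby merges consecutive markers into a single run, so ["marker1", "marker2", "x"] comes out as one sublist, whereas the loop in the question would start a new sublist at each marker.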
CodePudding user response:
Create (and include) a new inner list at the start or at a marker, otherwise append to the current inner list.
a = None
separated = [a := [x] for x in lst if not a or x.startswith('marker') or a.append(x)]
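For reference, here is a self-contained version of the same one-liner run on both lists from the question. It needs Python 3.8 or later for the := operator. The function name split_walrus is only for illustration; wrapping the comprehension in a function also resets a on every call.

```python
def split_walrus(lst):
    a = None  # the current inner list; None until the first element is seen
    # The condition is true at the very start or at a marker, and then
    # a := [x] creates a new inner list and puts it in the result.
    # Otherwise a.append(x) runs; it returns None, so nothing new is added
    # to the result and x ends up in the current inner list.
    return [a := [x] for x in lst
            if not a or x.startswith('marker') or a.append(x)]


print(split_walrus(["elem1", "elem2", "marker1", "elem3", "marker2", "elem4", "elem5"]))
# [['elem1', 'elem2'], ['marker1', 'elem3'], ['marker2', 'elem4', 'elem5']]
print(split_walrus(["marker1", "elem1", "elem2", "marker2", "elem3"]))
# [['marker1', 'elem1', 'elem2'], ['marker2', 'elem3']]
```

The trick relies on a side effect inside the filter condition, so it is compact but arguably harder to read than the plain loop.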