Update list to keep first matching condition and remove subsequent items

I'm attempting to transform the list:

data = [['e', 1], ['e', 2], ['b', 3], ['c', 4], ['e', 5], ['e', 6], ['e', 7], ['b', 8]]

For each run of consecutive sublists whose first element (position 0) is 'e', I want to keep only the first sublist of the run and remove the following 'e' sublists, up to the next sublist that starts with something other than 'e'.

So the expected output for the above data is:

[['e', 1], ['b', 3], ['c', 4], ['e', 5], ['b', 8]]

I'm unsure how to implement this. So far I have written:

updated_list = []
for index, segment in enumerate(data):
    if segment[0] == 'e' and data[index + 1][0] == 'e':
        updated_list.append(segment)
print(updated_list)

which produces:

[['e', 1], ['e', 5], ['e', 6]]

CodePudding user response：

You can use a for-loop with a conditional and a flag variable that tracks whether the previously seen item started with 'e'. An item is kept unless both it and the previous item start with 'e'.

last_seen_is_e = False  # True when the previous item started with "e"
updated = []
for item in data:
    # Keep the item unless it continues a run of "e" items.
    if not last_seen_is_e or item[0] != "e":
        updated.append(item)
    last_seen_is_e = item[0] == "e"
print(updated)

[['e', 1], ['b', 3], ['c', 4], ['e', 5], ['b', 8]]
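
The same rule can also be written without a separate flag, by looking at the previous item by index. This is only a minimal sketch, assuming data is a list of sublists like the one in the question:

```python
data = [['e', 1], ['e', 2], ['b', 3], ['c', 4],
        ['e', 5], ['e', 6], ['e', 7], ['b', 8]]

# Keep an item unless both it and the item just before it start with 'e'.
# The index == 0 check keeps the very first item, which has no predecessor.
updated = [
    item
    for index, item in enumerate(data)
    if item[0] != 'e' or index == 0 or data[index - 1][0] != 'e'
]
print(updated)
```

Looking backwards rather than forwards avoids indexing past the end of the list, and the first item of each run of 'e' items is the one that survives.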

CodePudding user response：

An attempt using itertools.groupby:

from itertools import groupby

data = [['e', 1], ['e', 2], ['b', 3], ['c', 4],
        ['e', 5], ['e', 6], ['e', 7], ['b', 8]]

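ret = []
# groupby splits data into runs of consecutive items with the same key:
# key is True for a run of 'e' items and False for a run of other items.
for key, item in groupby(data, lambda x: x[0] == 'e'):
    if key:
        ret.append(next(item))  # keep only the first item of an 'e' run
    else:
        ret.extend(item)        # keep every item of a non-'e' run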

As a 'one-liner', this might be:

from itertools import groupby, islice, chain

ret = list(chain.from_iterable(islice(item, 1 if key else None) for key, item in
                               groupby(data, lambda x: x[0] == 'e')))

Both variants return:

[['e', 1], ['b', 3], ['c', 4], ['e', 5], ['b', 8]]

The logic is the same for both: if a group consists of items with 'e' in the first position, only the first element of that group is added. For groups that do not start with 'e', all elements are added.
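
To see why this works, it may help to look at the groups that groupby produces for the sample data. A small sketch, using the same key function as in the code above (the groups variable is introduced here only for illustration):

```python
from itertools import groupby

data = [['e', 1], ['e', 2], ['b', 3], ['c', 4],
        ['e', 5], ['e', 6], ['e', 7], ['b', 8]]

# groupby yields lazy iterators, so each group is copied into a list
# before it is inspected.
groups = [(key, list(items))
          for key, items in groupby(data, lambda x: x[0] == 'e')]

for key, items in groups:
    print(key, items)
```

Each True group is a run of consecutive 'e' items, of which only the first is kept; each False group is kept in full. Note that groupby only groups adjacent items, which is exactly what is wanted here.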

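Your approach only appends the item at index if both the current and the subsequent item start with 'e'. As a result, it drops every item that does not start with 'e' and keeps all but the last item of each run of 'e' items.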

This might be a fix:

e_start = False
ret = []
for item in data:
    if item[0] == 'e':
        if not e_start:
            ret.append(item)
            e_start = True
    else:
        ret.append(item)
        e_start = False

Here, e_start being True means that the previous item started with 'e'.
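
If the same clean-up is needed in more than one place, the flag-based loop can be wrapped in a small helper. The following is only a sketch: the name collapse_runs and its target parameter are hypothetical and not part of the original answers.

```python
def collapse_runs(rows, target='e'):
    """Keep only the first row of each consecutive run starting with target."""
    result = []
    in_run = False  # True while the previous row started with target
    for row in rows:
        is_target = row[0] == target
        # Skip a row only when it continues an existing run of target rows.
        if not (is_target and in_run):
            result.append(row)
        in_run = is_target
    return result


data = [['e', 1], ['e', 2], ['b', 3], ['c', 4],
        ['e', 5], ['e', 6], ['e', 7], ['b', 8]]
print(collapse_runs(data))
```

The helper returns a new list and leaves the input untouched, so it also behaves sensibly for an empty list or a list made up of a single run.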