There are a lot of similar questions (like this one), but I did not find anything that suited my needs.
My objective is to remove groups of adjacent duplicates from a list.
For instance, if my list is
['A', 'A', 'A', 'B', 'B', 'B', 'C', 'C', 'A', 'C', 'C']
my desired output is
['A', 'B', 'C', 'A', 'C']
i.e. each group of adjacent duplicates is collapsed so that only one element per group remains.
My code so far involves a for loop with a condition:
def reduce_duplicates(l):
    assert len(l) > 0, "Passed list is empty."
    result = [l[0]]  # initialization
    for i in l:
        if i != result[-1]:
            result.append(i)
    return result
l = ['A', 'A', 'A', 'B', 'B', 'B', 'C', 'C', 'A', 'C', 'C']
print(reduce_duplicates(l))
# ['A', 'B', 'C', 'A', 'C']
It produces the expected output, but I suspect there is a native, optimized, and more elegant way to achieve the same result. Is that the case?
CodePudding user response:
Use groupby from itertools:
from itertools import groupby

lst = ['A', 'A', 'A', 'B', 'B', 'B', 'C', 'C', 'A', 'C', 'C']
out = [k for k, _ in groupby(lst)]
print(out)
# Output
['A', 'B', 'C', 'A', 'C']
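To see why this works: groupby yields a (key, group) pair for every run of equal adjacent elements, and the comprehension keeps only the keys. A quick illustration on the sample list:
from itertools import groupby

lst = ['A', 'A', 'A', 'B', 'B', 'B', 'C', 'C', 'A', 'C', 'C']
for key, group in groupby(lst):
    print(key, list(group))
# A ['A', 'A', 'A']
# B ['B', 'B', 'B']
# C ['C', 'C']
# A ['A']
# C ['C', 'C']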
Update
You can also use zip_longest from itertools:
from itertools import zip_longest

lst = ['A', 'A', 'A', 'B', 'B', 'B', 'C', 'C', 'A', 'C', 'C']
out = [l for l, r in zip_longest(lst, lst[1:]) if l != r]
print(out)
# Output
['A', 'B', 'C', 'A', 'C']
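The reason for zip_longest rather than plain zip is the last element: lst[1:] is one item shorter, so zip would never yield the final value on the left-hand side, while zip_longest pads the shorter sequence with None so that value is still compared (to None) and kept. A small illustration:
from itertools import zip_longest

print(list(zip(['A', 'A', 'B'], ['A', 'B'])))
# [('A', 'A'), ('A', 'B')]  -- 'B' never appears on the left, so it would be dropped
print(list(zip_longest(['A', 'A', 'B'], ['A', 'B'])))
# [('A', 'A'), ('A', 'B'), ('B', None)]  -- 'B' is compared against None and kept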
Or without any imports:
lst = ['A', 'A', 'A', 'B', 'B', 'B', 'C', 'C', 'A', 'C', 'C']
out = [lst[0]] + [r for l, r in zip(lst, lst[1:]) if l != r]
print(out)
# Output
['A', 'B', 'C', 'A', 'C']
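Note that, like reduce_duplicates in the question, this version assumes a non-empty list, since [lst[0]] raises an IndexError otherwise. One possible tweak, slicing instead of indexing, also covers the empty case:
lst = []
out = lst[:1] + [r for l, r in zip(lst, lst[1:]) if l != r]
print(out)
# []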
CodePudding user response:
The itertools documentation provides a recipe for exactly this, unique_justseen. Since it uses map, it may be a tiny bit faster than the regular list comprehension, and it also supports a key function.
import operator
from itertools import groupby

def unique_justseen(iterable, key=None):
    "List unique elements, preserving order. Remember only the element just seen."
    # unique_justseen('AAAABBBCCDAABBB') --> A B C D A B
    # unique_justseen('ABBCcAD', str.lower) --> A B C A D
    return map(next, map(operator.itemgetter(1), groupby(iterable, key)))
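Note that the recipe returns a lazy iterator rather than a list, so wrap the call in list() when a list is needed:
print(list(unique_justseen(['A', 'A', 'A', 'B', 'B', 'B', 'C', 'C', 'A', 'C', 'C'])))
# ['A', 'B', 'C', 'A', 'C']
print(list(unique_justseen('ABBCcAD', key=str.lower)))
# ['A', 'B', 'C', 'A', 'D']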