Remove duplicates without using list mutation

Time:10-15

I am trying to remove adjacent duplicates from a list without using list mutations like del or remove. Below is the code I tried:

def remove_dups(L):
    L = [x for x in range(0,len(L)) if L[x] != L[x-1]]
    return L

print(remove_dups([1,2,2,3,3,3,4,5,1,1,1]))

This outputs:

[1, 3, 6, 7, 8]

Can anyone explain to me how this output is produced? I want to understand the flow, but I wasn't able to work it out even by stepping through it in the VS Code debugger.

Input:

[1,2,2,3,3,3,4,5,1,1,1]

Expected output:

[1,2,3,4,5,1]

CodePudding user response:

I'll rename the variables to make this more readable:

def remove_dups(L):   
    L = [x for x in range(0,len(L)) if L[x] != L[x-1]]

becomes:

def remove_dups(lst):   
   return [index for index in range(len(lst)) if lst[index] != lst[index-1]]

You can see that, instead of looping over the items of the list, it loops over the indices. It compares the value at each index, lst[index], to the value at the previous index, lst[index-1], and keeps the index (not the value) only when the two differ.

The two main issues are:

  1. for the first index, 0, the comparison is against lst[-1], which is the last item of the list, so the first item is dropped whenever it equals the last one
  2. it returns the indices of the kept items rather than the items themselves.
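
To see where [1, 3, 6, 7, 8] comes from, here is a minimal sketch that records, for each index, the two values being compared and whether the index is kept. The helper name trace_comparisons is my own, purely for illustration:

```python
def trace_comparisons(lst):
    """Return (index, current, previous, kept) for every index of lst."""
    rows = []
    for index in range(len(lst)):
        current = lst[index]
        previous = lst[index - 1]  # for index 0 this is lst[-1], the last item
        rows.append((index, current, previous, current != previous))
    return rows

rows = trace_comparisons([1, 2, 2, 3, 3, 3, 4, 5, 1, 1, 1])
for row in rows:
    print(row)

# Indices whose comparison came out True: what the original function returns
kept = [index for index, _, _, is_kept in rows if is_kept]
print(kept)  # [1, 3, 6, 7, 8]
```

Index 0 is compared with lst[-1], which is 1 here, so it is dropped. The indices that survive are 1, 3, 6, 7 and 8, which is the output from the question.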

To make this work, I'd use the enumerate function, which yields each item together with its index, as follows:

def remove_dups(lst):
    return [item for index, item in enumerate(lst[:-1]) if item != lst[index + 1]] + [lst[-1]]

Here I'm looping through all of the items except the last one (lst[:-1]) and checking whether each item matches the next one, only adding it if it doesn't.

Finally, because the loop never visits the last value, we append it to the output with [lst[-1]]. Note that lst[-1] raises an IndexError on an empty list.
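
If the empty-list case matters, one possible variant (a sketch, not the only way to do it) compares each item with the previous one by zipping the list with itself shifted by one position, and uses the slice lst[:1] to keep the first item. The function name remove_adjacent_dups is hypothetical, and the sketch assumes the input is a list:

```python
def remove_adjacent_dups(lst):
    """Collapse runs of equal adjacent items without mutating lst."""
    # lst[:1] is [] for an empty list and [first_item] otherwise,
    # so no index lookup can fail.
    first = lst[:1]
    # zip(lst, lst[1:]) pairs every item with the one that follows it.
    rest = [cur for prev, cur in zip(lst, lst[1:]) if cur != prev]
    return first + rest

print(remove_adjacent_dups([1, 2, 2, 3, 3, 3, 4, 5, 1, 1, 1]))  # [1, 2, 3, 4, 5, 1]
print(remove_adjacent_dups([]))  # []
```

Slicing never raises on a short list, which is why this version needs no special case for empty or single-item input.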

CodePudding user response:

This is a job for itertools.groupby:

from itertools import groupby

def remove_dups(L):
    return [k for k,g in groupby(L)]

print(remove_dups([1,2,2,3,3,3,4,5,1,1,1]))

Output: [1, 2, 3, 4, 5, 1]
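
For anyone wondering why this works: groupby splits the input into runs of consecutive equal items and yields one (key, group) pair per run, so taking only the keys leaves one item per run. A small sketch that makes the runs visible (the variable names data, runs and keys are just for illustration):

```python
from itertools import groupby

data = [1, 2, 2, 3, 3, 3, 4, 5, 1, 1, 1]

# Materialise each group so the runs can be inspected
runs = [(key, list(group)) for key, group in groupby(data)]
print(runs)
# [(1, [1]), (2, [2, 2]), (3, [3, 3, 3]), (4, [4]), (5, [5]), (1, [1, 1, 1])]

# One key per run is the list with adjacent duplicates removed
keys = [key for key, _ in runs]
print(keys)  # [1, 2, 3, 4, 5, 1]
```

The trailing 1s form their own run because groupby only merges neighbours. It does not look for equal values elsewhere in the list.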
