Python remove element list element with same value at position-CodePudding

Let's assume I have a list, structured like this with approx 1 million elements:

a = [["a","a"],["b","a"],["c","a"],["d","a"],["a","a"],["a","a"]]

What is the fastest way to remove all elements from a that have the same value at index 0? The result should be

b = [["a","a"],["b","a"],["c","a"],["d","a"]]

Is there a faster way than this:

processed = []
no_duplicates = []

for elem in a:
    if elem[0] not in processed:
        no_duplicates.append(elem)
        processed.append(elem[0])

This works but the appending operations take ages.

CodePudding user response：

you can use set to keep the record of first element and check if for each sublist first element in this or not. it will took O(1) time compare to O(n) time to your solution to search.

>>> a = [["a","a"],["b","a"],["c","a"],["d","a"],["a","a"],["a","a"]]
>>> 
>>> seen = set()
>>> new_a = []
>>> for i in a:
...     if i[0] not in seen:
...             new_a.append(i)
...             seen.add(i[0])
... 
>>> new_a
[['a', 'a'], ['b', 'a'], ['c', 'a'], ['d', 'a']]
>>>

Space complexity : O(N) Time complexity: O(N) Search if first element there or not : O(1)

In case, no new list to be declared, then use del element, but this will increase time complexity

CodePudding user response：

You can use a dictionary comprehension using the first item as key. This will work in O(n) time.

out = list({x[0]: x for x in a}.values())

output: [['a', 'a'], ['b', 'a'], ['c', 'a'], ['d', 'a']]

CodePudding user response：

You can use a list comprehension with a condition and add the first element back to the results like so

[a[0]]   [n for n in a if n != a[0]]

Output

[['a', 'a'], ['b', 'a'], ['c', 'a'], ['d', 'a']]