I have a large table, a couple of million pairs of ints ([[1,2],[45,101],[22,222], etc.). What is the quickest way in Python to remove duplicates?
Creating an empty list and appending with an "if not in" check doesn't work, since it takes ages. Converting to NumPy and using "isin", I can't seem to get it to work on pairs.
CodePudding user response:
You can convert each pair to a tuple (tuples are hashable, lists are not) and build a set:
arr = [[1,2],[45,101],[22,222], [1,2]]
arr = set(tuple(i) for i in arr)
If you want to convert it back to a list of lists:
arr = [list(i) for i in arr]
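Note that a set does not preserve insertion order. If order matters, one alternative is dict.fromkeys(), which keeps the first occurrence of each pair; a minimal sketch:
arr = [[1, 2], [45, 101], [22, 222], [1, 2]]
# dict keys are unique and (since Python 3.7) preserve insertion order
arr = [list(t) for t in dict.fromkeys(tuple(p) for p in arr)]
# arr == [[1, 2], [45, 101], [22, 222]]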
CodePudding user response:
You could use np.unique():
import numpy as np

np.unique([[1,2],[45,101],[22,222],[22,222]], axis=0)
Output:
array([[  1,   2],
       [ 22, 222],
       [ 45, 101]])
Note that np.unique() sorts the rows, so the original order is not preserved.
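If you need to keep the order of first appearance, np.unique() can also return the index of each row's first occurrence via return_index, which you can then sort; a minimal sketch:
import numpy as np

arr = np.array([[1, 2], [45, 101], [22, 222], [22, 222]])
# return_index gives the index of each unique row's first occurrence in arr
_, idx = np.unique(arr, axis=0, return_index=True)
# sorting the indices restores the original order of first appearance
deduped = arr[np.sort(idx)]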
CodePudding user response:
Probably going to be this: list(set(my_list))
Edit: Whoops, that raises TypeError: unhashable type: 'list', since lists can't go in a set. In any case, if whatever is iterating over said list can perform the task of detecting duplicates itself, that would be faster than removing duplicates beforehand.
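For completeness, the one-liner does work once the inner lists are converted to hashable tuples:
my_list = [[1, 2], [45, 101], [22, 222], [1, 2]]
# tuples are hashable, so the pairs can be deduplicated in a set
deduped = [list(t) for t in set(map(tuple, my_list))]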