solution = [[1,0,0],[0,1,0], [1,0,0], [1,0,0]]
I have the above nested list, which contain some other lists inside it, how do we need to get the unique lists inside the solution
output = [[1,0,0],[0,1,0]
Note: each list is of same size
Things I have tried :
- Take each list and compare with all other lists to see if its duplicated or not ? but it is very slow..
How can I check before inserting inserting list , is there any deuplicate of it so to avoid inserting duplicates
CodePudding user response:
If you don't care about the order, you can use set
:
solution = [[1,0,0],[0,1,0],[1,0,0],[1,0,0]]
output = set(map(tuple, solution))
print(output) # {(1, 0, 0), (0, 1, 0)}
CodePudding user response:
Since lists are mutable objects you can't really check identity very quickly. You could convert to tuple, however, and store the tuple-ized view of each list in a set.
Tuples are heterogenous immutable containers, unlike lists which are mutable and idiomatically homogenous.
from typing import List, Any
def de_dupe(lst: List[List[Any]]) -> List[List[Any]]:
seen = set()
output = []
for element in lst:
tup = tuple(element)
if tup in seen:
continue # we've already added this one
seen.add(tup)
output.append(element)
return output
solution = [[1,0,0],[0,1,0], [1,0,0], [1,0,0]]
assert de_dupe(solution) == [[1, 0, 0], [0, 1, 0]]
CodePudding user response:
Pandas duplicate
might be of help.
import pandas as pd
df=pd.DataFrame([[1,0,0],[0,1,0], [1,0,0], [1,0,0]])
d =df[~df.duplicated()].values.tolist()
Output
[[1, 0, 0], [0, 1, 0]]
or, since you tag multidimensional-array
, you can use numpy approach.
import numpy as np
def unique_rows(a):
a = np.ascontiguousarray(a)
unique_a = np.unique(a.view([('', a.dtype)]*a.shape[1]))
return unique_a.view(a.dtype).reshape((unique_a.shape[0], a.shape[1]))
arr=np.array([[1,0,0],[0,1,0], [1,0,0], [1,0,0]])
output=unique_rows(arr).tolist()
Based on the suggestion in this OP
CodePudding user response:
try this solution :
x=[[1,0,0],[0,1,0], [1,0,0], [1,0,0]]
Import numpy and convert the nested list into a numpy array
import numpy as np
a1=np.array(x)
find unique across rows
a2 = np.unique(a1,axis=0)
Convert it back to a nested list
a2.tolist()
Hope this helps
CodePudding user response:
while lists are not hashable and therefore inefficient to duplicate, tuples are. So one way would be to transform your list into tuples and duplicate those.
>>> solution_tuples = [(1,0,0), (0,1,0), (1,0,0), (1,0,0)]
>>> set(solution_tuples)
{(1, 0, 0), (0, 1, 0)}