[[1.0, 1.0], [1.0, 10.0], [1.0, 11.0], [1.0, 12.0], [1.0, 13.0]],
[[2.0, 1.0], [2.0, 10.0], [2.0, 11.0], [2.0, 12.0], [2.0, 13.0]],
[[3.0, 1.0], [3.0, 10.0], [3.0, 11.0], [3.0, 12.0], [3.0, 13.0]],
[[4.0, 1.0], [4.0, 10.0], [4.0, 11.0], [4.0, 12.0], [4.0, 13.0]],
[[4.0, 1.0], [4.0, 10.0], [4.0, 11.0], [4.0, 12.0], [4.0, 13.0]]]
If I have a list like this, how do I return the duplicate rows as different groups?
I don't know how to compare if two lists are same when there are sub-list in them.
For this problem, the ideal output would be:
[[[4.0, 1.0], [4.0, 10.0], [4.0, 11.0], [4.0, 12.0], [4.0, 13.0]],
[[4.0, 1.0], [4.0, 10.0], [4.0, 11.0], [4.0, 12.0], [4.0, 13.0]]]
CodePudding user response:
You want to sort your data and make each row a tuple to be able to group them. Then check if there are groups with more than 1 element -> duplicates.
Input:
ll = [[[1.0, 1.0], [1.0, 10.0], [1.0, 11.0], [1.0, 12.0], [1.0, 13.0]],
[[2.0, 1.0], [2.0, 10.0], [2.0, 11.0], [2.0, 12.0], [2.0, 13.0]],
[[3.0, 1.0], [3.0, 10.0], [3.0, 11.0], [3.0, 12.0], [3.0, 13.0]],
[[4.0, 1.0], [4.0, 10.0], [4.0, 11.0], [4.0, 12.0], [4.0, 13.0]],
[[4.0, 1.0], [4.0, 10.0], [4.0, 11.0], [4.0, 12.0], [4.0, 13.0]]]
import itertools
l = [tuple(sorted(lst)) for lst in ll]
out = []
for k, g in itertools.groupby(l):
g = list(g)
if len(g)>1:
out.append(g)
print(out)
[[([4.0, 1.0], [4.0, 10.0], [4.0, 11.0], [4.0, 12.0], [4.0, 13.0]),
([4.0, 1.0], [4.0, 10.0], [4.0, 11.0], [4.0, 12.0], [4.0, 13.0])]]
or as one-liner:
out = list(filter(lambda s: len(s) > 1, [list(g) for k, g in itertools.groupby([tuple(sorted(lst)) for lst in ll])]))
CodePudding user response:
You can find all those nested elements whose count is greater than one and then keep them in List comprehension.
l=[[[1.0, 1.0], [1.0, 10.0], [1.0, 11.0], [1.0, 12.0], [1.0, 13.0]],
[[2.0, 1.0], [2.0, 10.0], [2.0, 11.0], [2.0, 12.0], [2.0, 13.0]],
[[3.0, 1.0], [3.0, 10.0], [3.0, 11.0], [3.0, 12.0], [3.0, 13.0]],
[[4.0, 1.0], [4.0, 10.0], [4.0, 11.0], [4.0, 12.0], [4.0, 13.0]],
[[4.0, 1.0], [4.0, 10.0], [4.0, 11.0], [4.0, 12.0], [4.0, 13.0]]]
[x for x in l if l.count(x)>1]
#output
[[[4.0, 1.0], [4.0, 10.0], [4.0, 11.0], [4.0, 12.0], [4.0, 13.0]],
[[4.0, 1.0], [4.0, 10.0], [4.0, 11.0], [4.0, 12.0], [4.0, 13.0]]]