Count the number of duplicate sublists in a list with Python

I've been researching for quite some time now but can't seem to find how to do this properly. I have a list that consists of a total of 113287 sub-lists, each of which holds 2 integers with 2-3 digits each.

my_list = [[123, 456], [111, 111], [222, 222], [333, 333], [123, 456], [222, 222], [123, 456]]

Now I want to count the number of sub-lists that exist more than once. I don't want the number of duplicates overall, and the index is irrelevant; I just want to know which combinations of values exist more than once.

The result for the example should be "2", since only the sub-lists "[222, 222]" and "[123, 456]" exist more than once.

If possible and only if it doesn't overcomplicate things, I would like to do it without external libraries.

I just can't seem to figure it out, any help is appreciated.

CodePudding user response：

Use collections.Counter to count the elements, then loop over the result to keep only those that have a count greater than 1, and sum:

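from collections import Counter

my_list = [[123, 456], [111, 111], [222, 222], [333, 333],
           [123, 456], [222, 222], [123, 456]]

# Lists are unhashable, so convert each sub-list to a tuple before counting
c = Counter(map(tuple, my_list))
# Each True counts as 1, so this sums the sub-lists seen more than once
number = sum(v > 1 for v in c.values())
print(number)

output: 2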

NB. You need to convert the sublists to tuples so that they are hashable and can be counted by Counter.
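
Since the question also asks which combinations occur more than once, the same Counter idea can be extended to return them. This is only a minimal sketch; the function name duplicated_sublists is made up for illustration and is not part of the answer above:

```python
from collections import Counter

def duplicated_sublists(sublists):
    """Return a dict mapping each repeated sub-list (as a tuple) to its count."""
    counts = Counter(map(tuple, sublists))
    # Keep only the entries that were seen more than once
    return {key: n for key, n in counts.items() if n > 1}

my_list = [[123, 456], [111, 111], [222, 222], [333, 333],
           [123, 456], [222, 222], [123, 456]]

dupes = duplicated_sublists(my_list)
print(len(dupes))  # 2
print(dupes)       # {(123, 456): 3, (222, 222): 2}
```

len(dupes) gives the requested count, and the keys tell you which sub-lists were repeated.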

CodePudding user response：

You can iterate over the set of unique sub-lists. Because lists are unhashable, you first need to convert each sub-list in lst to a tuple. Then simply count the number of times each unique sub-list appears in lst:

lst = [[123, 456], [111, 111], [222, 222], [333, 333], [123, 456], [222, 222], [123, 456]]
out = sum(1 for l in set(map(tuple, lst)) if lst.count(list(l)) > 1)
print(out)

Output: 2

Also, if you want to treat [12, 34] and [34, 12] as the same sub-list (so that [[12, 34], [34, 12]] counts as one sub-list appearing twice), then building off of @mozway's answer, you can do:

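# Continues from the Counter example above (my_list and Counter already defined)
for i, l in enumerate(my_list):
    # If the reversed pair appeared earlier, store this one reversed too
    if l[::-1] in my_list[:i]:
        my_list[i] = l[::-1]

c = Counter(map(tuple, my_list))
number = sum(v > 1 for v in c.values())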
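
A possible alternative for the order-insensitive case, which does not modify the input list, is to normalize each sub-list by sorting it before counting. This is a sketch under the assumption that [12, 34] and [34, 12] should be treated as equal; the function name count_unordered_duplicates and the sample data are hypothetical:

```python
from collections import Counter

def count_unordered_duplicates(pairs):
    """Count distinct sub-lists that occur more than once, ignoring element order."""
    # Sorting each sub-list gives [12, 34] and [34, 12] the same key
    counts = Counter(tuple(sorted(pair)) for pair in pairs)
    return sum(n > 1 for n in counts.values())

sample = [[12, 34], [34, 12], [56, 78], [111, 111]]
print(count_unordered_duplicates(sample))  # 1
```

Sorting each sub-list avoids the repeated slice lookups of the loop above, which may matter for a list with more than 100,000 entries.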

CodePudding user response：

You can make a non-duplicated version of your list, then count the number of duplicated elements:

ls = [[123, 456], [111, 111], [222, 222], [333, 333], [123, 456], [222, 222], [123, 456]]
uni = []
c = 0
for l in ls:
    if l not in uni:
        uni.append(l)

for l in uni:
    if ls.count(l) != 1:
        c += 1

print(c)

Output: 2
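
Note that ls.count(l) rescans the whole list for every unique sub-list, which can get slow for a list of 113287 entries. A single pass over the list with a plain dict avoids that and needs no imports at all. A minimal sketch, with the function name count_repeated chosen only for illustration:

```python
def count_repeated(sublists):
    """Count distinct sub-lists that occur more than once, in one pass."""
    counts = {}
    for sub in sublists:
        key = tuple(sub)  # lists are unhashable, tuples can be dict keys
        counts[key] = counts.get(key, 0) + 1
    return sum(1 for n in counts.values() if n > 1)

ls = [[123, 456], [111, 111], [222, 222], [333, 333],
      [123, 456], [222, 222], [123, 456]]
print(count_repeated(ls))  # 2
```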

CodePudding user response：

Try this. First, build a list of the unique sub-lists (from: Removing duplicates from a list of lists):

MY_LIST = [[123, 456], [111, 111], [222, 222], [333, 333], [123, 456], [222, 222], [123, 456]]

uniques = []
for elem in MY_LIST:
    if elem not in uniques:
        uniques.append(elem)
print(uniques,'\n')

Then count how often each unique sub-list occurs in MY_LIST and keep those that occur more than once:

repeated = {}
for elem in uniques:
    counter = 0
    for _elem in MY_LIST:
        if elem == _elem:
            counter += 1
    if counter > 1:
        repeated[str(elem)] = counter

print('amount of repeated sublists: {}'.format(len(repeated)))
print(repeated)

output:

[[123, 456], [111, 111], [222, 222], [333, 333]] 

amount of repeated sublists: 2
{'[123, 456]': 3, '[222, 222]': 2}