Use Counter to count unique values in multiple lists

I have a list of lists and I'm trying to use Counter to get the number of unique words across all of the lists.

my_list = [['My', 'name', 'is', 'Joe'],
           ['My', 'name', 'is', 'Sally'],
           ['My', 'name', 'is', 'Mike']]

If it were just the first list I think I could do this:

from collections import Counter

counter_object = Counter(my_list[0])
keys = counter_object.keys()
num_values = len(keys)

print(num_values)

But I'm unsure how to do this for multiple lists. Any help is much appreciated, thanks.

Edit: The expected output is 6, because the unique words 'My', 'name', 'is', 'Joe', 'Sally', and 'Mike' total 6.

CodePudding user response：

If I understand your question correctly, you want to count the unique items across all of the sublists.

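from collections import Counter

# MM is your list of lists (here, the data from the question)
MM = [['My', 'name', 'is', 'Joe'],
      ['My', 'name', 'is', 'Sally'],
      ['My', 'name', 'is', 'Mike']]

def count_unique(M):
    # Flatten the sublists into one list
    flats = [x for sub in M for x in sub]
    # Count the occurrences of each item
    counts = Counter(flats)
    # The number of distinct keys is the number of unique items
    return len(counts)

print(count_unique(MM))     # check it
# 6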

CodePudding user response：

Use chain.from_iterable to flatten the list, then use Counter:

from collections import Counter
from itertools import chain

data = [
    ["My", "name", "is", "Joe"],
    ["My", "name", "is", "Sally"],
    ["My", "name", "is", "Mike"],
]

counts = Counter(chain.from_iterable(data))
print(counts)

Output

Counter({'My': 3, 'name': 3, 'is': 3, 'Joe': 1, 'Sally': 1, 'Mike': 1})

For more on how to flatten lists of lists, see this.

If you want the total of unique keys, on top of the counts, just do:

res = len(counts)
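
Putting the pieces of this answer together, a minimal self-contained sketch might look like the following. It reuses the names `data`, `counts`, and `res` from the snippets above and relies on the fact that a Counter is a dict subclass, so `len()` returns the number of distinct keys.

```python
from collections import Counter
from itertools import chain

data = [
    ["My", "name", "is", "Joe"],
    ["My", "name", "is", "Sally"],
    ["My", "name", "is", "Mike"],
]

# Flatten the sublists lazily and count every word
counts = Counter(chain.from_iterable(data))

# len() of a Counter is the number of distinct keys, i.e. unique words
res = len(counts)
print(res)  # 6
```

This keeps the per-word counts available in `counts` in case they are needed later.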

Note that if you only care about the total of uniques, you can directly use a set:

uniques = set(chain.from_iterable(data))
print(uniques)

Output

{'Sally', 'Mike', 'My', 'name', 'is', 'Joe'}
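
Note that a set is unordered, so the order in which the words are displayed can differ between runs. Since the question asks for a number rather than the words themselves, here is a small sketch of getting the count directly, assuming the same `data` as above (`unique_words` is just a name chosen for this illustration):

```python
from itertools import chain

data = [
    ["My", "name", "is", "Joe"],
    ["My", "name", "is", "Sally"],
    ["My", "name", "is", "Mike"],
]

# A set keeps one copy of each word, so its length is the unique-word total
unique_words = set(chain.from_iterable(data))
print(len(unique_words))  # 6
```

This skips the per-word tallies entirely, which is fine when only the total matters.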