I am given a list of lists like this:
pairs = [[(0, 0), (0, 1), (0, 2), (1, 2), (2, 2), (3, 2), (3, 1), (2, 1), (3, 1), (3, 2), (3, 3), (3, 2), (2, 2)],
[(2, 2), (2, 1)],
[(1, 1), (1, 2), (2, 2), (2, 1)]]
and the desired output is: {(2,2)}
.
I need to find the most frequent element(s). It has to return more than one value if there are elements that are repeated just as many times.
I tried solving it with an intersection of three lists, but it prints out {(2,1), (2,2)}
, instead of {(2,2)}
, since the element (2,2)
is repeated two times in the first list.
I saw a few examples with import collections
, but I don't understand them so I don't know how to change the code to be suitable for my problem.
I also tried the following:
seen = set()
repeated = set()
for l in pairs:
for i in set(l):
if i in seen:
repeated.add(i)
if i in repeated:
repeated.add(i)
else:
seen.add(i)
but still doesn't return the correct answer.
CodePudding user response:
As hinted in comments, defaultdict
is super helpful.
from collections import defaultdict
We have `pairs', but since you want the most frequent across all lists, we'll flatten it.
pairs = [[(0, 0), (0, 1), (0, 2), (1, 2), (2, 2), (3, 2), (3, 1), (2, 1), (3, 1), (3, 2), (3, 3), (3, 2), (2, 2)],
[(2, 2), (2, 1)],
[(1, 1), (1, 2), (2, 2), (2, 1)]]
flat_pairs = [tpl for lst in pairs for tpl in lst]
And now we'll create a defaultdict
that defaults to zero to hold our counts, and iterate over flat_pairs to add counts to the dict.
counts = defaultdict(lambda: 0)
for tpl in flat_pairs:
counts[tpl] = 1
Or we could skip the flattening stage:
counts = defaultdict(lambda: 0)
for lst in pairs:
for tpl in lst:
counts[tpl] = 1
We can find the max count value by using max
on the values of the dict, and then a list comprehension to get the tuple or tuples with the max count.
max_count = max(counts.values())
max_count_tuples = [tpl for tpl, count in counts.items() if count == max_count]
CodePudding user response:
Alternative solution without using the Counter
method from collections
:
def get_freq_tuple(data):
counts = {}
for pairs in data:
for pair in pairs:
counts[pair] = counts.get(pair, 0) 1
return [pair for pair in counts if counts[pair] == max(counts.values())]
if __name__ == "__main__":
pairs = [[(0, 0), (0, 1), (0, 2), (1, 2), (2, 2), (3, 2), (3, 1), (2, 1),
(3, 1), (3, 2), (3, 3), (3, 2), (2, 2)],
[(2, 2), (2, 1)],
[(1, 1), (1, 2), (2, 2), (2, 1)]]
print(get_freq_tuple(pairs))
Output:
[(2, 2)]
Explanation:
- Count the occurrences of each tuple and store them in a dictionary. The key of the dictionary is the tuple and the value is the occurrence.
- Filter the tuples in the dictionary by maximum occurrences of the tuples.
Disclaimer:
- Using
Counter
method fromcollections
is much more efficient.
References:
CodePudding user response:
collections.Counter()
will work...you just need to figure out how to pass all the pairs in the nested list, which you can do with a list comprehension. For example:
from collections import Counter
pairs = [[(0, 0), (0, 1), (0, 2), (1, 2), (2, 2), (3, 2), (3, 1), (2, 1), (3, 1), (3, 2), (3, 3), (3, 2), (2, 2)],
[(2, 2), (2, 1)],
[(1, 1), (1, 2), (2, 2), (2, 1)]]
counts = Counter(pair for l in pairs for pair in l)
counts.most_common(1)
# [((2, 2), 4)]
If you have a tie, you will need to look through the top choices and pick off the ones that have the same count. You can get the sorted list by looking at counts.most_common()
.
itertools.groupby
is a common way to deal with this. For example, if you had a tie, you could get all the top entries like:
from collections import Counter
from itertools import groupby
pairs = [[(0, 0), (0, 1), (0, 2), (1, 2), (2, 2), (3, 2), (3, 1), (2, 1), (3, 1), (3, 2), (3, 3), (3, 2), (2, 2)],
[(2, 2), (2, 1)],
[(1, 1), (1, 2), (2, 2), (2, 1), (3, 2)]]
counts = Counter(pair for l in pairs for pair in l)
count, groups = next(groupby(counts.most_common(), key=lambda t: t[1]))
[g[0] for g in groups]
# [(2, 2), (3, 2)]