Let's use a simple example: say I have a list of lists
ll = [[1, 2], [1, 3], [2, 3], [1, 2, 3], [2, 3, 4]]
and I want to find all longest lists, which means all lists that maximize the len
function. Of course we can do
def func(x):
return len(x)
maxlen = func(max(ll, key=lambda x: func(x)))
res = [l for l in ll if func(l) == maxlen]
print(res)
Output
[[1, 2, 3], [2, 3, 4]]
But I wonder if there are more efficient way to do this, especially when the function is very expensive or the list is very long. Any suggestions?
CodePudding user response:
when the function is very expensive
Note that you do compute func(x)
for each element twice, first there
maxlen = func(max(ll, key=lambda x: func(x)))
then there
res = [l for l in ll if func(l) == maxlen]
so it would be beneficial to store what was already computed. functools.lru_cache
allow that easily just replace
def func(x):
return len(x)
using
import functools
@functools.lru_cache(maxsize=None)
def func(x):
return len(x)
However, beware as due to way how data are stored argument(s) must be hashable, so in your example you would first need convert list e.g. to tuple
s i.e.
ll = [(1, 2), (1, 3), (2, 3), (1, 2, 3), (2, 3, 4)]
See descripiton in docs for further discussion
CodePudding user response:
Is not OK use dictionary
like below, (this is O(n)
)
ll = [[1, 2], [1, 3], [2, 3], [1, 2, 3], [2, 3, 4]]
from collections import defaultdict
dct = defaultdict(list)
for l in ll:
dct[len(l)].append(l)
dct[max(dct)]
Output:
[[1, 2, 3], [2, 3, 4]]
>>> dct
defaultdict(list, {2: [[1, 2], [1, 3], [2, 3]], 3: [[1, 2, 3], [2, 3, 4]]})
OR use setdefault
and without defaultdict
like below:
ll = [[1, 2], [1, 3], [2, 3], [1, 2, 3], [2, 3, 4]]
dct = {}
for l in ll:
dct.setdefault(len(l), []).append(l)
Output:
>>> dct
{2: [[1, 2], [1, 3], [2, 3]], 3: [[1, 2, 3], [2, 3, 4]]}
CodePudding user response:
From a computer science/algorithms perspective, this is a very classical "reduce" problem.
so, pseudocode. It's honestly very straightforward.
metric():= a mapping from elements to non-negative numbers
winner = []
maxmetric = 0
for element in ll:
if metric(element) larger than maxmetric:
winner = [ element ]
maxmetric = metric(element)
else if metric(element) equal to maxmetric:
append element to winner