max() function unexpected result using key argument

Time:12-14

names.txt:

Ben
Lukasz
Filippe
Sam
Artur

code:

full_names = (name.strip() for name in open('names.txt'))
length = ((name, len(name)) for name in full_names)
longest = max(length, key=lambda x: x[1] <= 5)
print(longest)

The result is always Ben, but I want the longest name that satisfies the condition (length <= 5). Expected result: Artur

How can I modify max() to achieve this output?

If I remove the condition and set it as: max(length, key=lambda x: x[1]), it works fine.

CodePudding user response:

If you want to filter the possible values, don't do that with the key, just filter with a generator expression and then pass that to max:

max((x for x in length if x[1] <= 5), key=lambda x: x[1])
# Or without a lambda:
from operator import itemgetter
max((x for x in length if x[1] <= 5), key=itemgetter(1))

You could define the key to achieve this in a roundabout way, e.g. with:

max(length, key=lambda x: x[1] if x[1] <= 5 else -1)

so all lengths greater than the limit are treated as lengths of -1; this will misbehave when all the inputs fail the filter though (claiming a max exists when nothing passed the filter), so it's up to you if that's acceptable.
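To make that caveat concrete, here is a minimal sketch. It assumes a hypothetical in-memory list `names` in which no name passes the limit, and compares the sentinel key with filtering plus the `default` argument of max():

```python
# Hypothetical sample in which no name passes the length limit.
names = ["Lukasz", "Filippe"]
limit = 5

# Sentinel key: every name gets key -1, so max() still returns one of them
# (the first one, since all keys tie).
via_key = max(names, key=lambda n: len(n) if len(n) <= limit else -1)

# Filtering first: the generator is empty, so max() falls back to `default`.
# Without `default`, an empty iterable raises ValueError.
via_filter = max((n for n in names if len(n) <= limit), key=len, default=None)

print(via_key)     # Lukasz
print(via_filter)  # None
```

With filtering, "nothing matched" stays distinguishable from a real result, either as None or as a ValueError if you leave out default.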

In practice, it's kinda silly to decorate with the length when it can just be cheaply computed live, so:

full_names = (name.strip() for name in open('names.txt'))
longest = max((x for x in full_names if len(x) <= 5), key=len)

is probably what I'd do.

CodePudding user response:

You need to filter full_names to keep only the ones with a length <= 5, and then find the maximum. Remember you can use the key argument of the max function to map each item to a different value. For example:

filtered_names = [name for name in full_names if len(name) <= 5]
longest = max(filtered_names, key=len)

Which gives longest = 'Artur'. Here, the len function is called for each element in filtered_names and the result of that function is used to calculate the max.

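You could combine the two lines into a single line and save one loop by using a generator expression instead of creating the filtered_names list. Because max gets a second argument (key), the generator expression needs its own parentheses:

longest = max((name for name in full_names if len(name) <= 5),
              key=len)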

If you really want to use the length list you have defined, you still need to filter out the elements that don't fulfill your condition, and then find the max using the length (which is at index 1 of each element of length):

filtered_length = [item for item in length if item[1] <= 5]
longest = max(filtered_length, key=lambda item: item[1])[0]

max returns the item of length that has the largest value at item[1]. Since this item is a tuple that contains both the name and the length, we take the [0] element to get just the longest name. In one line:

longest = max((item for item in length if item[1] <= 5), key=lambda item: item[1])[0]
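One thing to watch when trying these snippets one after another: in the question, full_names and length are generators rather than lists, so they can only be iterated once. A small sketch of that behaviour, assuming an in-memory list `lines` as a stand-in for the file:

```python
# In-memory stand-in for the lines of names.txt (an assumption for this sketch).
lines = ["Ben\n", "Lukasz\n", "Filippe\n", "Sam\n", "Artur\n"]

full_names = (line.strip() for line in lines)
length = ((name, len(name)) for name in full_names)

# The first pass consumes the whole generator chain.
first = max((item for item in length if item[1] <= 5),
            key=lambda item: item[1])[0]

# A second pass sees nothing; `default` avoids a ValueError on the empty input.
second = max((item for item in length if item[1] <= 5),
             key=lambda item: item[1], default=None)

print(first)   # Artur
print(second)  # None
```

If you need to reuse the values, build a list once (for example with list(length)) and run max on that.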

CodePudding user response:

The first problem is that you're not actually comparing the lengths of the names; you're comparing whether or not the name length is less than or equal to 5. Basically you're taking the maximum out of this table based on the second column:

Ben     True
Lukasz  False
Filippe False
Sam     True
Artur   True

In order to address that you need the lambda expression to return the actual length:

full_names = (name.strip() for name in open('names.txt'))
length = ((name, len(name)) for name in full_names)
longest = max(length, key=lambda x: x[1])
print(longest)

That produces the following table and picks the row with the largest value in the second column (i.e., Filippe):

Ben     3
Lukasz  6
Filippe 7
Sam     3
Artur   5

If you want to only consider names with five letters or fewer, you can either filter:

longest = max(filter(lambda x: x[1] <= 5, length), key=lambda x: x[1])

Or make the lambda return zero for names with a length greater than 5:

longest = max(length, key=lambda x: x[1] if x[1] <= 5 else 0)

This last approach will produce the following and again pick based on the second column:

Ben     3
Lukasz  0
Filippe 0
Sam     3
Artur   5

CodePudding user response:

The problem

The issue is a misconception about what max()'s key= parameter does. It assigns a key value to each processed item by calling the given function on it and then returns the maximum element by comparing those keys. In your case it assigns:

  • Ben -> True
  • Lukasz -> False
  • Filippe -> False
  • Sam -> True
  • Artur -> True

Since bool values are just a special case of ints, an equivalent representation of the keys would be

  • Ben -> 1
  • Lukasz -> 0
  • Filippe -> 0
  • Sam -> 1
  • Artur -> 1

Then max() will return the element with the largest key, which is "Ben": none of the following items are assigned a larger key, and when several items share the largest key, max() returns the first one it encounters.
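Both points can be checked directly. A short sketch, using a hypothetical in-memory list `names` in place of the file:

```python
# bool is a subclass of int, so True compares equal to 1 and False to 0.
print(isinstance(True, int))  # True
print(True == 1, False == 0)  # True True

names = ["Ben", "Lukasz", "Filippe", "Sam", "Artur"]

# The key from the question only says whether a name is short enough.
keys = [len(name) <= 5 for name in names]
print(keys)  # [True, False, False, True, True]

# Several items share the largest key (True); max() returns the first of them.
winner = max(names, key=lambda name: len(name) <= 5)
print(winner)  # Ben
```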

The solution

You want to get the longest item that has five or fewer characters. We can achieve this via:

longest = max(full_names, key=lambda name: (length := len(name)) * (length <= 5))

This uses the fact from above that bools are just ints. If a name matches the condition, its length is multiplied by True, i.e. 1, and is thus unchanged. If the condition (<= 5) is not met, the length is multiplied by False, i.e. 0, and thus becomes zero.
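To see the keys this produces, here is a rough sketch. The `:=` operator needs Python 3.8 or later; `capped_length` is a hypothetical helper that computes the same key without it, and `names` is an in-memory stand-in for the file contents:

```python
names = ["Ben", "Lukasz", "Filippe", "Sam", "Artur"]

def capped_length(name, limit=5):
    # Hypothetical helper: the length if it is within the limit, else 0.
    n = len(name)
    return n * (n <= limit)  # (n <= limit) is True/False, i.e. 1/0

keys = {name: capped_length(name) for name in names}
print(keys)  # {'Ben': 3, 'Lukasz': 0, 'Filippe': 0, 'Sam': 3, 'Artur': 5}

longest = max(names, key=capped_length)
print(longest)  # Artur

# The same key written inline with an assignment expression (Python 3.8+).
longest_inline = max(names, key=lambda name: (n := len(name)) * (n <= 5))
print(longest_inline)  # Artur
```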

CodePudding user response:

First, you should filter the sequence and then find the max value in the filtered one.

full_names = (name.strip() for name in open('names.txt'))
longest = max(filter(lambda x: len(x) <= 5, full_names), key=len)
print(longest)
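The snippets in this thread open names.txt without ever closing it. A possible variant that uses a with block is sketched below; the temporary file is created only so the sketch runs on its own, and its contents are assumed to match the question:

```python
import os
import tempfile

# Create a throwaway names.txt so the sketch is self-contained
# (assumption: one name per line, as in the question).
tmp_dir = tempfile.mkdtemp()
path = os.path.join(tmp_dir, "names.txt")
with open(path, "w") as f:
    f.write("Ben\nLukasz\nFilippe\nSam\nArtur\n")

# max() consumes the generator inside the with block, so the file is
# still open while it is being read and is closed right afterwards.
with open(path) as f:
    stripped = (line.strip() for line in f)
    longest = max(filter(lambda name: len(name) <= 5, stripped), key=len)

print(longest)  # Artur
```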