I don't fully understand how the any() function in this script works

I have a script that checks whether a list contains any duplicate items. Here's the code:

items = ["Blue", "Black", "Red"]

def isUnique(item):
    seen = list()
    return not any(i in seen or seen.append(i) for i in item)


print(isUnique(items))

It prints "True" if all the items in the given list are unique and "False" if one or more items in the list are duplicated. Can someone please explain the any() part of the script for me? I don't fully understand how it works.

CodePudding user response：

This code is kind of a hack, since it uses a generator expression with side-effects and exploits the fact that append returns None, which is falsy.
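Both facts the hack relies on can be checked directly. A minimal sketch (the names result and value are only illustrative):

```python
seen = []

# list.append mutates the list in place and returns None
result = seen.append("Blue")
print(result)        # None
print(bool(result))  # False, so None counts as falsy

# `or` short-circuits: the right-hand side only runs if the left side is falsy
value = ("Blue" in seen) or seen.append("Blue")
print(value)  # True
print(seen)   # ['Blue'] (the second append was skipped)
```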

The equivalent code written in the imperative style is like so:

def isUnique(items):
    seen = list()
    for i in items:
        if i in seen or seen.append(i):
            return False
    return True

The or is still a bit strange there - it is being used for its short-circuiting behaviour, so that append is only called when i in seen is false - so we could rewrite it like this:

def isUnique(items):
    seen = list()
    for i in items:
        if i in seen:
            return False
        else:
            seen.append(i)
    return True

This is equivalent because append is only called when i in seen is false, and the call to append returns None, which is falsy, so the return False line doesn't execute in that case.
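If you want to convince yourself that the rewrite behaves the same as the original one-liner, a quick check along these lines should do. The function names is_unique_any and is_unique_loop and the sample lists are made up for this sketch:

```python
def is_unique_any(items):
    # the original one-liner, unchanged in behaviour
    seen = []
    return not any(i in seen or seen.append(i) for i in items)


def is_unique_loop(items):
    # the explicit loop version
    seen = []
    for i in items:
        if i in seen:
            return False
        seen.append(i)
    return True


samples = [[], ["Blue"], ["Blue", "Black", "Red"], ["Blue", "Black", "Blue"]]
for sample in samples:
    # both versions must agree on every sample
    assert is_unique_any(sample) == is_unique_loop(sample)
    print(sample, is_unique_loop(sample))
```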

CodePudding user response：

First you need to understand how the or operator works. or is used like exp1 or exp2

It returns the first operand that is truthy, or the last operand if none of them is truthy.

>>> 2 or 3
2
>>> 5 or 0.0
5
>>> [] or 3
3
>>> 0 or {}
{}

Now for your expression written as a list comprehension, [i in seen or seen.append(i) for i in items]: for each item, i in seen evaluates to False, so seen.append(i) is evaluated as well. list.append returns None, so the comprehension contains only None values.

>>> seen = []
>>> items = ["Blue", "Black", "Red"]
>>> res = [i in seen or seen.append(i) for i in items]
>>> res
[None, None, None]
>>> any(res)
False

As per the any documentation, it returns False because none of the elements in the list is truthy (bool(None) is False).

>>> help(any)
Help on built-in function any in module builtins:

any(iterable, /)
    Return True if bool(x) is True for any x in the iterable.

    If the iterable is empty, return False.
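To see that any() works on truthiness rather than on strict booleans, here is a small sketch with made-up inputs:

```python
print(any([None, None, None]))  # False: bool(None) is False
print(any([None, True, None]))  # True: one truthy element is enough
print(any([0, "", [], 5]))      # True: 5 is truthy, the rest are falsy
print(any([]))                  # False: empty iterable
```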

CodePudding user response：

The any function in Python takes an iterable and returns True if at least one of its elements is truthy, like OR-ing all of them together.

The expression i in seen or seen.append(i) for i in item appends i to seen if it's not in seen already. But if it is already in seen, then append() does not run, since the first part is already True and Python doesn't need to know whether the second part is true: True OR'd with anything is True. So the seen list ends up holding the unique colours seen so far.

i in seen or seen.append(i) for i in item is also a generator expression. It yields True for an item that is already in seen and None otherwise, and any checks the values it yields: if even one of them is truthy, the whole any returns True.

So the first time an item that is already in the seen list is found, any stops consuming the generator and returns True.

So if the list contains a duplicate element, no more conditions are evaluated after it and no more elements are appended to seen.

So if the list had duplicate elements, like,

items = ["Blue", "Blue", "Black", "Red"]


def isUnique(item):
    seen = list()
    unique = not any(i in seen or seen.append(i) for i in item)
    print(seen)
    return unique


isUnique(items)

would print just ['Blue'].

CodePudding user response：

EDIT: there are great answers. Adding some simpler ways to achieve the wanted result:

Method 1:

items = ["Blue", "Black", "Red"]
items_set = set(items)
if len(items_set) != len(items):
    # there are duplicates
    print("The list contains duplicates")

This works because a set object ‘removes’ duplicates.

Method 2:

contains_duplicates = any(items.count(element) > 1 for element in items)  # True if the list contains duplicates, False otherwise

See https://www.kite.com/python/answers/how-to-check-for-duplicates-in-a-list-in-python
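Note that Method 2 calls items.count() once per element, so it rescans the list each time. If you also want to know which items are repeated, one possible alternative (not from the answers above, and assuming the items are hashable) is collections.Counter. The helper name find_duplicates is hypothetical:

```python
from collections import Counter


def find_duplicates(items):
    # count every item in a single pass, then keep those seen more than once
    counts = Counter(items)
    return [item for item, n in counts.items() if n > 1]


print(find_duplicates(["Blue", "Blue", "Black", "Red"]))  # ['Blue']
print(find_duplicates(["Blue", "Black", "Red"]))          # []
```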

any is a great function

Return True if any element of the iterable is true. If the iterable is empty, return False

Your function isUnique, however, does a bit more logic. Let's break it down:

First you create an empty list object and store it in 'seen' variable.

for i in item - iterates the list of items.

i in seen - This expression evaluates to True if 'i' is a member of 'seen', and False otherwise.

seen.append(i) - adds i to seen. This call always returns None.

Notice the or between i in seen and seen.append(i). The whole expression is True if i in seen is True; otherwise seen.append(i) runs and the expression evaluates to its result, None.

At this point, I'd run [i in seen or seen.append(i) for i in item], see the result and experiment with it. The result for your example is [None, None, None].

Basically, for each item, you check whether it is already in the list and, if it is not, add it.

Finally, you use the any() function, which returns True if the iterable contains a truthy value. This happens only if i in seen evaluates to True for some item.

Notice you are using not any(...), which returns True when there are no repetitions and False when there is at least one.
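For comparison, here is the same experiment on a list that does contain a repetition (the input list is just an example). Because a list comprehension evaluates every item, you can see exactly where the True comes from:

```python
items = ["Blue", "Blue", "Black", "Red"]
seen = []

# a list comprehension evaluates every element, unlike any() on a generator
res = [i in seen or seen.append(i) for i in items]

print(res)           # [None, True, None, None]
print(any(res))      # True
print(not any(res))  # False
print(seen)          # ['Blue', 'Black', 'Red']
```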

There are simpler and clearer ways to implement this. You should try!

CodePudding user response：

It is quite simple: the expression inside any() is a generator. any() draws from that generator and returns True (and stops) at the first element from the generator that is True. If it exhausts the generator, then it returns False.

The expression in the generator (i in seen or seen.append(i)) is a trick to express as a one-liner the logic that: if i is in the list, the expression is True and any() stops immediately, otherwise, i is added to the list and the generator continues.

The function can be significantly improved by using a set instead of a list:

def isUnique(item):
    seen = set()
    return not any(i in seen or seen.add(i) for i in item)

It is much faster to test for the presence of an item in a set (O(1) on average) than in a list (O(n)).

One interesting and perhaps underappreciated aspect of this code is that it works on a (potentially infinite) generator. It will stop drawing from the generator at the first repeated item. Subsequent items that would be obtained by the generator are not evaluated at all (with potential side-effects, desirable or not).
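A small sketch of that behaviour, using itertools.cycle as the infinite source. The tracked() wrapper and the consumed list are helpers invented here only to record how many items were actually drawn:

```python
from itertools import cycle


def is_unique(items):
    seen = set()
    return not any(i in seen or seen.add(i) for i in items)


consumed = []


def tracked(source):
    # record every item that is actually pulled from the source
    for x in source:
        consumed.append(x)
        yield x


# cycle() never ends, but the 4th item repeats the 1st, so any() stops there
result = is_unique(tracked(cycle(["Blue", "Black", "Red"])))
print(result)    # False
print(consumed)  # ['Blue', 'Black', 'Red', 'Blue']
```

Keep in mind that on an infinite generator with no repeats at all (for example itertools.count()), the function would never return.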

A different approach, suitable for known and finite collections of items, would be the following:

def isUnique(items):
    items = tuple(items)  # in case items is a generator
    return len(set(items)) == len(items)

This assumes that all the items fit in memory. Obviously this won't work if items is a generator of a very large or infinite number of elements.