I have a script that checks whether a list contains duplicate items. Here's the code:
items = ["Blue", "Black", "Red"]
def isUnique(item):
    seen = list()
    return not any(i in seen or seen.append(i) for i in item)
print(isUnique(items))
It prints "True" if all the items in the given list are unique and "False" if one or more items in the list are duplicated. Can someone please explain the any() part of the script for me, as I don't fully understand how it works?
CodePudding user response:
This code is kind of a hack, since it uses a generator expression with side effects and exploits the fact that append returns None, which is falsy.
The equivalent code written in the imperative style is like so:
def isUnique(items):
    seen = list()
    for i in items:
        if i in seen or seen.append(i):
            return False
    return True
The or is still a bit strange there - it is being used for its short-circuiting behaviour, so that append is only called when i in seen is false - so we could rewrite it like this:
def isUnique(items):
    seen = list()
    for i in items:
        if i in seen:
            return False
        else:
            seen.append(i)
    return True
This is equivalent because append is only called when i in seen is false, and the call to append returns None, which is falsy, so the return False line does not execute in that case.
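As a quick sanity check, the sketch below puts the one-liner from the question next to the explicit loop and confirms that they agree on a few inputs. The function names is_unique_any and is_unique_loop and the sample lists are made up for this illustration.

```python
def is_unique_any(items):
    # Original one-liner: relies on list.append returning None (falsy).
    seen = []
    return not any(i in seen or seen.append(i) for i in items)


def is_unique_loop(items):
    # Explicit version of the same logic.
    seen = []
    for i in items:
        if i in seen:
            return False
        seen.append(i)
    return True


# Hypothetical sample inputs, including the empty list.
samples = [
    [],
    ["Blue", "Black", "Red"],
    ["Blue", "Blue", "Black", "Red"],
    [1, 2, 3, 1],
]
for sample in samples:
    assert is_unique_any(sample) == is_unique_loop(sample)

print([is_unique_loop(s) for s in samples])  # [True, True, False, False]
```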
CodePudding user response:
First you need to understand how the or operator works.
In exp1 or exp2, Python evaluates exp1 and returns its value if it is truthy; otherwise it evaluates exp2 and returns that value, whether it is truthy or not.
For example:
>>> 2 or 3
2
>>> 5 or 0.0
5
>>> [] or 3
3
>>> 0 or {}
{}
Now for the list comprehension [i in seen or seen.append(i) for i in items]: for each item here, i in seen evaluates to False, so seen.append(i) is evaluated and its return value becomes the value of the whole or expression. Since list.append always returns None, the comprehension contains only None:
>>> seen = []
>>> items = ["Blue", "Black", "Red"]
>>> res = [i in seen or seen.append(i) for i in items]
>>> res
[None, None, None]
>>> any(res)
False
As per the any documentation, it returns False here because no element of the list is truthy: bool(None) is False.
>>> help(any)
Help on built-in function any in module builtins:
any(iterable, /)
Return True if bool(x) is True for any x in the iterable.
If the iterable is empty, return False.
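To see where a True value comes from, it may help to run the same comprehension on a list that does contain a repeat. This is only a sketch; items_with_dup is a hypothetical input that does not appear in the question.

```python
seen = []
items_with_dup = ["Blue", "Black", "Blue"]  # hypothetical input with a repeat

# A list comprehension evaluates every element, unlike the generator
# expression in the question, so the full result can be inspected.
res = [i in seen or seen.append(i) for i in items_with_dup]

print(res)       # [None, None, True]
print(any(res))  # True
print(seen)      # ['Blue', 'Black']
```

The third element is True because "Blue" was already in seen, so the or expression stopped at its first operand and append was never called.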
CodePudding user response:
The any function in Python takes an iterable and returns True if at least one of its elements is truthy, much like OR-ing all of them together.
The expression i in seen or seen.append(i) for i in item appends i to seen if it is not in seen already. If i is already in seen, append() does not run: the first operand is already True, and since True OR'd with anything is True, Python does not evaluate the second operand. So seen ends up being a list of the unique colours encountered so far.
i in seen or seen.append(i) for i in item is also a generator expression. It yields True for an item that has been seen before and None otherwise, and any checks these values as they are produced; as soon as one of them is truthy, any returns True.
So the first time an item that is already in seen is found, any stops consuming the generator and returns True.
Once a duplicate element is found, no more conditions are evaluated and no more elements are appended to seen.
So if the list had duplicate elements, like this:
items = ["Blue", "Blue", "Black", "Red"]
def isUnique(item):
    seen = list()
    unique = not any(i in seen or seen.append(i) for i in item)
    print(seen)
    return unique
isUnique(items)
the printed output would be just
['Blue']
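One way to observe the early exit directly is to wrap the input in a generator that records every item it hands out. This is a minimal sketch; the helper traced and the list consumed are names invented for the illustration.

```python
def traced(items, log):
    # Yield items one by one, recording each item that is actually requested.
    for item in items:
        log.append(item)
        yield item


def is_unique(items):
    seen = []
    return not any(i in seen or seen.append(i) for i in items)


consumed = []
result = is_unique(traced(["Blue", "Blue", "Black", "Red"], consumed))

print(result)    # False
print(consumed)  # ['Blue', 'Blue']
```

Only the first two items are ever requested: any returns as soon as the second "Blue" makes the expression True, so "Black" and "Red" are never looked at.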
CodePudding user response:
EDIT: there are great answers. Adding some simpler ways to achieve the wanted result:
Method 1:
items = ["Blue", "Black", "Red"]
items_set = set(items)
if len(items_set) != len(items):
    # there are duplicates
    print("Duplicates found")
This works because a set object keeps only one copy of each value, so duplicates are 'removed'.
Method 2:
contains_duplicates = any(items.count(element) > 1 for element in items)  # True if the list contains duplicates, False otherwise
See https://www.kite.com/python/answers/how-to-check-for-duplicates-in-a-list-in-python
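For comparison, here are both methods wrapped as functions and run on the same inputs. This is a sketch only; the function names has_duplicates_set and has_duplicates_count are invented here.

```python
def has_duplicates_set(items):
    # Method 1: a set keeps one copy of each value, so a shorter set
    # means at least one value was repeated. Requires hashable items.
    return len(set(items)) != len(items)


def has_duplicates_count(items):
    # Method 2: list.count scans the whole list for every element,
    # so this takes quadratic time on long lists.
    return any(items.count(element) > 1 for element in items)


for sample in (["Blue", "Black", "Red"], ["Blue", "Blue", "Black"]):
    print(sample, has_duplicates_set(sample), has_duplicates_count(sample))
```

Both give the same answer; Method 1 is usually the better choice for long lists, since it does a single pass instead of one scan per element.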
any is a great function. From the documentation:
Return True if any element of the iterable is true. If the iterable is empty, return False.
Your function isUnique, however, does a bit more logic. Let's break it down:
First you create an empty list object and store it in the seen variable.
for i in item - iterates over the list of items.
i in seen - this expression evaluates to True if i is a member of seen, and False otherwise.
seen.append(i) - adds i to seen. This call always returns None.
Notice the or operator in i in seen or seen.append(i). The expression is True if i in seen is True; otherwise seen.append(i) is evaluated and the expression yields its return value, None.
At this point, I'd run [i in seen or seen.append(i) for i in item], look at the result and experiment with it. The result for your example is [None, None, None].
Basically, for each item, you check whether it is already in the list and, if it is not, add it.
Finally, you use the any() function, which returns True if the iterable contains a truthy value. This happens only if i in seen returns True for some item.
Notice you are using not any(...), which returns True when there are no repetitions and False otherwise.
There are simpler and clearer ways to implement this. You should try!
CodePudding user response:
It is quite simple: the expression inside any() is a generator. any() draws from that generator and returns True (and stops) at the first element from the generator that is True. If it exhausts the generator, then it returns False.
The expression in the generator (i in seen or seen.append(i)) is a trick to express as a one-liner the following logic: if i is in the list, the expression is True and any() stops immediately; otherwise, i is added to the list and the generator continues.
The function can be significantly improved by using a set instead of a list:
def isUnique(item):
    seen = set()
    return not any(i in seen or seen.add(i) for i in item)
It is much faster to test for presence of an item in a set (O(1) on average) than in a list (O(n)).
One interesting and perhaps underappreciated aspect of this code is that it works on a (potentially infinite) generator. It will stop drawing from the generator at the first repeated item. Subsequent items that would be obtained by the generator are not evaluated at all (with potential side-effects, desirable or not).
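A small sketch of that behaviour, using itertools.cycle as a stand-in for an infinite source (the choice of input is an assumption made for the example):

```python
import itertools


def is_unique(items):
    seen = set()
    return not any(i in seen or seen.add(i) for i in items)


# itertools.cycle repeats "a", "b", "c" forever, yet the call returns
# as soon as the second "a" arrives.
print(is_unique(itertools.cycle("abc")))  # False
```

Note that this only terminates because a repeat eventually shows up; on an infinite generator that never repeats, such as itertools.count(), the call would never return.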
A different approach, suitable for known and finite collections of items, would be the following:
def isUnique(items):
    items = tuple(items)  # in case items is a generator
    return len(set(items)) == len(items)
This assumes that all the items fit in memory. Obviously this won't work if items is a generator of a very large or infinite number of elements.
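To illustrate why the tuple() step is there, the following sketch passes a generator to the length-based version. The name is_unique_len and the sample strings are made up for the example.

```python
def is_unique_len(items):
    items = tuple(items)  # materialise, so len() works and items can be reused
    return len(set(items)) == len(items)


gen = (c for c in "abca")
print(is_unique_len(gen))  # False

# Without the tuple() step, set() would exhaust the generator, and
# len() does not accept a generator at all.
gen = (c for c in "abc")
try:
    len(gen)
except TypeError as exc:
    print("TypeError:", exc)
```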