I found a case where using map() isn't equivalent to a list comprehension. It happens when next is used as the first argument.
For example:
l1 = [1, 2]
l2 = ['hello', 'world']
iterators = [iter(l1), iter(l2)]
# list comprehension
values1 = [next(it) for it in iterators]
# values1 = [1, "hello"]
values2 = [next(it) for it in iterators]
# values2 = [2, "world"]
values3 = [next(it) for it in iterators]
# raises StopIteration
l1 = [1, 2]
l2 = ['hello', 'world']
iterators = [iter(l1), iter(l2)]
# map
values1 = list(map(next, iterators))
# values1 = [1, "hello"]
values2 = list(map(next, iterators))
# values2 = [2, "world"]
values3 = list(map(next, iterators))
# values3 = []
# doesn't raise StopIteration
Any other exceptions occur as they should. Example:
def divide_by_zero(value: int):
    return value // 0
l = [1, 2, 3]
values = list(map(divide_by_zero, l))
# raises ZeroDivisionError as expected
values = [divide_by_zero(value) for value in l]
# raises ZeroDivisionError as expected, too
This seems very strange. The behaviour is the same in Python 3.9 and Python 3.11.
It seems like map() works like this:
def map(func, iterator):
    try:
        while True:
            item = next(iterator)
            yield func(item)
    except StopIteration:
        pass
but I expected it to work like this:
def map(func, iterator):
    while True:
        try:
            item = next(iterator)
        except StopIteration:
            break
        yield func(item)
Is it a bug?
CodePudding user response:
Try calling next on the map object:
>>> m = map(next, iterators)
>>> next(m)
1
>>> next(m)
'hello'
>>> next(m)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
StopIteration
It's list that sees StopIteration and uses it to stop building the list from what map yields.
The list comprehension, on the other hand, is building the list by iterating over iterators, not a particular iterator in that list. That is, next(it) is used to produce a value for the list, not to determine if we've reached the end of iterators.
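A rough pure-Python model of that interaction, for illustration only. The names MapSketch and build_list are hypothetical, and the real map and list are implemented in C, so this is a sketch of the observable behaviour rather than the actual implementation:

```python
class MapSketch:
    """Hypothetical stand-in for map() with a single iterable."""

    def __init__(self, func, iterable):
        self.func = func
        self.it = iter(iterable)

    def __iter__(self):
        return self

    def __next__(self):
        item = next(self.it)    # StopIteration here: the input is exhausted
        return self.func(item)  # StopIteration from func escapes the same way


def build_list(iterator):
    """Hypothetical stand-in for list(): stop at the first StopIteration."""
    result = []
    while True:
        try:
            result.append(next(iterator))
        except StopIteration:
            return result


# Two already-exhausted iterators: next() raises StopIteration on the first.
exhausted = [iter([]), iter([])]
sketch_result = build_list(MapSketch(next, exhausted))
builtin_result = list(map(next, exhausted))
```

Both sketch_result and builtin_result end up as empty lists: the consumer cannot tell whether the StopIteration came from the input running out or from the callback.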
CodePudding user response:
Looks like you've found one of the rare cases where allowing StopIteration to bubble up like an exception causes the wrong behaviour. It's not a bug, it's just an unfortunate consequence of Python's design decision to use an exception to signal the end of an iterator; it's a pitfall, like mutable default arguments, except it shows up much less frequently.
As @chepner's answer notes, the problem is that list is catching the StopIteration and therefore thinking the map is exhausted, when what's actually happened is that the callback function next raised an exception which you want to treat like a real exception, i.e. a failure condition.
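As a related data point: inside a generator the same leak is reported rather than swallowed. Since Python 3.7 (PEP 479), a StopIteration that escapes a generator body is converted to RuntimeError, so a generator expression behaves differently from map here. A small check, using made-up empty iterators:

```python
iterators = [iter([]), iter([])]
outcome = None

try:
    # next(it) raises StopIteration inside the generator expression's body;
    # the interpreter turns that into RuntimeError instead of ending iteration.
    list(next(it) for it in iterators)
except RuntimeError as exc:
    outcome = type(exc).__name__
```

After this runs, outcome holds the name of the exception that was raised, whereas list(map(next, iterators)) on the same input would return an empty list without complaint.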
To avoid this, generally speaking you should not allow next to be called in a context where the StopIteration can bubble up and be caught in the wrong place. If you want to pass next as a callback, I suggest writing a wrapper around next which converts the StopIteration into a proper exception*:
def safe_next(it):
    try:
        return next(it)
    except StopIteration:
        raise ValueError('Unexpected end of iterator')
Usage example:
>>> l1 = [1, 2]
>>> l2 = ['hello', 'world']
>>> iterators = [iter(l1), iter(l2)]
>>> list(map(safe_next, iterators))
[1, 'hello']
>>> list(map(safe_next, iterators))
[2, 'world']
>>> list(map(safe_next, iterators))
Traceback (most recent call last):
  File "<stdin>", line 3, in safe_next
StopIteration

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "<stdin>", line 5, in safe_next
ValueError: Unexpected end of iterator
That said, in this use-case you should probably just use zip:
>>> pairs = zip(l1, l2)
>>> next(pairs)
(1, 'hello')
>>> next(pairs)
(2, 'world')
>>> next(pairs)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
StopIteration
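Keep in mind that zip stops silently at the shortest input. If the inputs might differ in length and you want that to be visible, itertools.zip_longest pads the missing values instead. A short comparison, with lists made up for illustration:

```python
from itertools import zip_longest

l1 = [1, 2, 3]
l2 = ['hello', 'world']

# zip drops the unmatched 3 without any error.
truncated = list(zip(l1, l2))

# zip_longest keeps it and fills the gap with fillvalue.
padded = list(zip_longest(l1, l2, fillvalue=None))
```

Here truncated has two pairs and padded has three, the last one being (3, None).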
* StopIteration doesn't carry a stack trace, so it's not a "proper exception" in that sense; this is another reason you don't want a StopIteration to bubble up the call stack when the end of an iterator is a genuine failure condition.