I am writing a generator in Python 3 that may yield one value or several.
What I want is to loop over this iterator starting at the second value, calling an API request function with each value. If the generator yields only a single value, the for loop and its body do not need to run. If the generator yields more than one value, the function inside the for loop should be called for the second value onwards.
The reason I want to start at the second value is that the first value has already been used for an API request and its result has been stored.
My question is about a generator that produces a single value.
Here is a code example (I have replaced the API request with print()):
def iterexample():  # a simple generator that yields a single value
    yield 0

print(0)
iter = iterexample()
next(iter)  # generator is consumed once here
for i in iter:  #1 generator is exhausted
    print(i, ' inside loop')  #2 this is skipped because the generator is exhausted
#3 rest of the code outside the loop will be executed
It returns what I expected: only 0 is printed, not "0 inside loop"
0
My questions are:
- Is this the safest and most Pythonic way to do that? Will it raise any error?
- Will it produce an infinite loop? I am very afraid that it will result in an infinite loop of API requests.
- Please review my #1 ~ #3 comments in the code above. Is my understanding correct?
Thanks for the response and the help. Cheers!
CodePudding user response:
1. Is this the safest and most Pythonic way to do that? Will it raise any error?
Once a generator is exhausted, it raises StopIteration every time it is asked for a new value. A for loop handles this by ending the loop when the exception is raised, which makes it safe to pass an exhausted generator to a for loop.
However, your code calls next directly, and is therefore only safe if it also handles StopIteration. In that case you would need to either document that the generator provided must produce at least one value, or make the code tolerant of the empty case. If the generator yielded no values, you would get an error, e.g.:
def iterexample():
    while False:
        yield 0

print(next(iterexample()))
Traceback (most recent call last):
  File "test.py", line 5, in <module>
    print(next(iterexample()))
StopIteration
To guard against empty generators, you can pass a default value as the optional second argument to next:
print(next(iterexample(), "default"))
default
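Putting both points together for the case in the question, a minimal sketch could look like the following. The names `process_rest`, `handle` and `_MISSING` are made up for illustration, and `handle` only stands in for the real API request function.

```python
_MISSING = object()  # hypothetical sentinel: distinguishes "no value" from a yielded None


def process_rest(values, handle):
    """Skip the first value of `values` and call `handle` on every later one.

    Returns the number of values handled after the first.
    """
    it = iter(values)
    first = next(it, _MISSING)  # the default means StopIteration is never raised here
    if first is _MISSING:
        return 0  # empty generator: nothing to do
    handled = 0
    for value in it:  # the body runs zero times if only one value was yielded
        handle(value)
        handled += 1
    return handled


def single():
    yield 0


def several():
    yield from (0, 1, 2)


seen = []
print(process_rest(single(), seen.append))   # 0
print(process_rest(several(), seen.append))  # 2
print(seen)                                  # [1, 2]
```

A dedicated sentinel object is used instead of None so that a generator that legitimately yields None is not mistaken for an empty one.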
2. Will it produce an infinite loop? I am very afraid that it will result in an infinite loop of API requests.
Again this depends on the generator. Generators do not need to have an end value. You can easily define non-ending generators like this:
def iterexample():
    i = 0
    while True:
        yield i
        i += 1

for i in iterexample():  # This never ends.
    print(i)
If this is a concern for you, one way to prevent never-ending output is to use islice, which cuts off your generator after a given number of values have been consumed:
from itertools import islice

for i in islice(iterexample(), 5):
    print(i)
0
1
2
3
4
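islice also accepts a start index, so skipping the first value and capping the number of later values can be done in one call. A small sketch, where `counter`, `single` and the cap of 3 are arbitrary choices for illustration:

```python
from itertools import islice


def counter():
    """A never-ending generator that counts up from 0."""
    i = 0
    while True:
        yield i
        i += 1


# Skip index 0, then stop before index 4: at most 3 values are taken after the first.
rest = list(islice(counter(), 1, 4))
print(rest)  # [1, 2, 3]


def single():
    yield 0


# A generator with only one value gives an empty result; no error is raised.
print(list(islice(single(), 1, 4)))  # []
```

With this form the loop body can never run more often than the cap allows, whatever the generator does.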
CodePudding user response:
If I understand your issue correctly, you need the first value for one case and the rest for another. I would recommend building a structure that fits your needs, something like this:
class MyStructure:
    def __init__(self, initial_data):
        if not initial_data:
            # Make sure your data structure is valid before using it
            raise ValueError("initial_data is empty")
        self.initial_data = initial_data

    @property
    def cached_value(self):
        return self.initial_data[0]

    @property
    def all_but_first(self):
        return self.initial_data[1:]
In this case, you make sure your data is valid, and you can give your accessors names that reflect what those values represent. In this example I gave them dummy names, but you should try to use names that are relevant to your business.
Such a class could be used this way (names changed just to illustrate how method naming can document your code):
tasks = TaskQueue(get_input_in_some_way())
advance_task_status(tasks.current_task)
for pending_task in tasks.pending_tasks:
    log_remaining_time(pending_task)
You should first try to understand what your data structure represents, and then build a useful API that hides the implementation and better reflects your business.
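Note that indexing and slicing work on sequences such as lists and tuples, not on generators. If the input is a generator, as in the question, one option is to copy it into a tuple first. Here is a rough sketch, assuming the generator is finite (an endless one would never finish being copied). `FirstAndRest`, `from_iterable`, `first` and `rest` are illustrative names only:

```python
class FirstAndRest:
    """Holds a non-empty sequence and exposes its first item and the remainder."""

    def __init__(self, items):
        if not items:
            raise ValueError("items is empty")
        self._items = items

    @classmethod
    def from_iterable(cls, iterable):
        # tuple() consumes the whole iterable, so it must be finite.
        return cls(tuple(iterable))

    @property
    def first(self):
        return self._items[0]

    @property
    def rest(self):
        return self._items[1:]


def single():
    yield 0


data = FirstAndRest.from_iterable(single())
print(data.first)  # 0
for value in data.rest:  # empty tuple, so the body never runs
    print(value, "inside loop")
```

The trade-off is that every value is produced up front, so this only suits generators that are cheap to exhaust.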