This is a simplified version of a scraping script I'm using. My question is quite simple: is there a way to modify the loop so that time.sleep() is skipped in the last iteration? (I included the timer to avoid abusing API calls.) After the last iteration nothing more is downloaded, so the final time.sleep() is unnecessary, and I would like the code to go directly to the second part of the script.
import random
import math
import datetime
from datetime import datetime, timedelta
import time

user_list = ['a', 'b', 'c']

def scraper():
    print('Do something')

def parser():
    print('Exporting to XLSX')

# First part, downloading.
for user in user_list:
    scraper()
    # Sleep timer
    sleep_seconds = random.randint(300*1000, 600*1000)/1000
    print('Sleeping for {} seconds...'.format(sleep_seconds))
    # time.sleep(sleep_seconds)

# Second part, parsing.
parser()
Output:
Do something
Sleeping for 494.028 seconds...
Do something
Sleeping for 562.442 seconds...
Do something
Sleeping for 515.752 seconds... (I want to skip this one)
Exporting to XLSX
I was thinking of doing something like:
for user in user_list:
    if user == user_list[-1]:
        print(user)
        scraper()
    else:
        print(user)
        scraper()
        # Sleep timer
        sleep_seconds = random.randint(300*1000, 600*1000)/1000
        print('Sleeping for {} seconds...'.format(sleep_seconds))

# Second part, parsing.
parser()
But I'm not sure if it's the "best" way.
CodePudding user response:
A simple way to do this is to keep track of the current index with enumerate and compare it against the last index:
>>> data = list(range(10))
>>> last_index = len(data) - 1
>>> for i, x in enumerate(data):
...     print(x)
...     if i < last_index:
...         print("--")
...
0
--
1
--
2
--
3
--
4
--
5
--
6
--
7
--
8
--
9
>>>
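Applied to the loop from the question, the same idea might look like the sketch below. The function name download_all and its sleep, min_s and max_s parameters are my own additions for illustration (they are not in the original script); passing the sleep function in just makes it possible to try the loop without actually waiting.

```python
import random
import time


def download_all(user_list, scraper, sleep=time.sleep, min_s=300, max_s=600):
    """Call scraper() once per user, pausing between calls but not after the last."""
    last_index = len(user_list) - 1
    for i, user in enumerate(user_list):
        scraper()
        if i < last_index:
            # Same 300-600 s range as the question, with millisecond resolution.
            sleep_seconds = random.randint(min_s * 1000, max_s * 1000) / 1000
            print('Sleeping for {} seconds...'.format(sleep_seconds))
            sleep(sleep_seconds)


# Demo with a stand-in for time.sleep so nothing actually waits:
# each requested pause is appended to a list instead.
pauses = []
download_all(['a', 'b', 'c'], lambda: print('Do something'), sleep=pauses.append)
print(len(pauses))  # 2 pauses for 3 users
```

In the real script you would leave sleep at its default and pass your own scraper function.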
CodePudding user response:
Sleep first and download after. And to avoid sleeping before the first occurrence, initialize the sleep to 0 and set the real sleeps after the download:
sleep_seconds = 0
for user in user_list:
    print('Sleeping for {} seconds...'.format(sleep_seconds))
    time.sleep(sleep_seconds)
    scraper()
    # Sleep timer
    sleep_seconds = random.randint(300*1000, 600*1000)/1000

# Second part, parsing.
parser()
If you don't want the "Sleeping for 0 seconds..." message, you can add a conditional:
for user in user_list:
    if sleep_seconds:
        print('Sleeping for {} seconds...'.format(sleep_seconds))
        time.sleep(sleep_seconds)
    ...
CodePudding user response:
Checking for the last item of an iterable is usually cumbersome. You could instead move the sleep to the beginning of the loop and only do it from the second pass onward (the enumerate index is 0, and therefore falsy, on the first pass):
for needSleep, user in enumerate(user_list):
    if needSleep:
        # Sleep timer
        sleep_seconds = random.randint(300*1000, 600*1000)/1000
        print('Sleeping for {} seconds...'.format(sleep_seconds))
        time.sleep(sleep_seconds)
    scraper()
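If you really do need to know which item is the last one, for example because the users come from a generator that has no len() and no [-1], one possible approach is to look one item ahead. This is only a sketch; with_last_flag is a hypothetical helper name, not something from the standard library.

```python
def with_last_flag(iterable):
    """Yield (item, is_last) pairs by looking one item ahead."""
    iterator = iter(iterable)
    try:
        previous = next(iterator)
    except StopIteration:
        return  # empty input: yield nothing
    for current in iterator:
        # Another item exists, so `previous` was not the last one.
        yield previous, False
        previous = current
    yield previous, True


# Works for a generator, which cannot be indexed or measured up front.
flags = list(with_last_flag(user for user in ['a', 'b', 'c']))
print(flags)  # [('a', False), ('b', False), ('c', True)]
```

In the download loop you would then write "for user, is_last in with_last_flag(user_list):" and sleep only when is_last is false.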