Loop function over list of values until list null

Time:10-04

I need to create a web-scraping program where I pass a value from a list (i.e. an integer indicating the number of clicks) to a function. If the function doesn't succeed for a value, I need to store that value and then re-run the function on the unsuccessful values until they all succeed (or at least for n trials). I only have pseudo-code of what I'm thinking, since I'm not sure how to do this:

first_ind = [1,2,3,...]
error_ind = []

#here there may be a loop for n trials

for i in first_ind:
   try:
      some_scrape_function(i)  #returning some list of success values
   except:
      error_ind.append(i)

#here I don't know how to re-run the function over a list that is potentially smaller at every iteration,
#until potentially null.
while new_error_ind:
   new_error_ind = []
   for i in error_ind:
      try:
         some_scrape_function(i)
            return success_list_i
      except:
         new_error_ind.append(i)

In this last part, how can I make sure the function is re-run until success is obtained for all values?

CodePudding user response:

Keep a list of failures and keep looping until they pass or the retries are exhausted:

import random

def simulate_scrape(n): # fail 20% of the time
    if random.random() >= .8:
        raise RuntimeError('failed')
    return True

def do_scrape(indexes, tries):
    success = []
    unsuccessful = indexes.copy() # don't mutate passed-in list

    # empty containers and zero values are treated as false in Python,
    # so if there are items in the list and tries is not zero, process...
    while unsuccessful and tries:
        failed = []
        for n in unsuccessful:
            try:
                simulate_scrape(n) # raises an exception on failure, as in the OP's pseudo-code
            except RuntimeError:
                failed.append(n)
            else:
                success.append(n)
        unsuccessful = failed  # transfer the failed list for next round
        print(f'failed: {unsuccessful}')
        tries -= 1

    return success,unsuccessful

to_do = list(range(50)) # all indexes 0-49 start unsuccessful
passed,failed = do_scrape(to_do, 3)
print(f'FINAL {passed=}\n'
      f'      {failed=}\n')

Sample output from three runs:

failed: [3, 10, 13, 16, 21, 24, 27, 31]
failed: [3, 27]
failed: []
FINAL passed=[0, 1, 2, 4, 5, 6, 7, 8, 9, 11, 12, 14, 15, 17, 18, 19, 20, 22, 23, 25, 26, 28, 29, 30, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 10, 13, 16, 21, 24, 31, 3, 27]
      failed=[]

failed: [0, 5, 6, 38, 40, 49]
failed: [0]
failed: []
FINAL passed=[1, 2, 3, 4, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 39, 41, 42, 43, 44, 45, 46, 47, 48, 5, 6, 38, 40, 49, 0]
      failed=[]

failed: [2, 5, 9, 10, 14, 20, 23, 28, 49]
failed: [9, 10, 14, 23]
failed: [9]
FINAL passed=[0, 1, 3, 4, 6, 7, 8, 11, 12, 13, 15, 16, 17, 18, 19, 21, 22, 24, 25, 26, 27, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 2, 5, 20, 28, 49, 10, 14, 23]
      failed=[9]
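
The question mentions that the scrape function returns a list of success values, which the simulation above discards. A minimal sketch of one way to keep those return values while retrying, assuming the scraper signals failure by raising an exception. The names retry_collect and flaky_scrape, and the delay parameter, are hypothetical and only for illustration:

```python
import time

def retry_collect(indexes, scrape, tries=3, delay=0.0):
    """Call scrape(i) for each index, retrying failures for up to `tries` rounds.

    Returns (results, pending): results maps index -> return value,
    pending lists the indexes that still failed after the last round.
    """
    results = {}
    pending = list(indexes)  # copy so the caller's list is not mutated
    for attempt in range(tries):
        if not pending:
            break  # everything succeeded, stop early
        failed = []
        for i in pending:
            try:
                results[i] = scrape(i)  # keep the scraper's return value
            except Exception:
                failed.append(i)  # retry this index in the next round
        pending = failed
        if pending and attempt < tries - 1:
            time.sleep(delay)  # optional pause before the next round
    return results, pending

# Hypothetical stand-in for the real scraper:
# index 2 fails on its first call only, index 3 always fails.
calls = {}

def flaky_scrape(i):
    calls[i] = calls.get(i, 0) + 1
    if i == 3 or (i == 2 and calls[i] == 1):
        raise RuntimeError(f'scrape {i} failed')
    return [i, i * 10]

results, pending = retry_collect([1, 2, 3], flaky_scrape, tries=3)
print(results)  # {1: [1, 10], 2: [2, 20]}
print(pending)  # [3]
```

Storing results in a dict keyed by index keeps each value tied to the input that produced it, even though retried indexes finish out of order. Catching Exception is broader than the RuntimeError used above; narrowing it to the errors the real scraper raises would avoid hiding unrelated bugs.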