I'm trying to scrape data from a website that shows live football odds. When the odds drop there is a specific change in the page's HTML, and my script should then send a notification to a Telegram bot I've made. Here is my code:
from distutils.command.clean import clean
import time
import requests
from bs4 import BeautifulSoup as bs
import urllib3
urllib3.disable_warnings(urllib3.exceptions.InsecureRequestWarning)
ids_list=[]
game_urls=[]
game_name=[]
gfix=[]
livecapper_url ="https://livecapper.ru/bet365/" #the website link
while(True):
    page = requests.get(livecapper_url, verify=False).text
    soup = bs(page, "html.parser")
    game_ids = soup.find_all(game_id=True)  # getting the IDs of every football game
    for g in game_ids:
        x = g.get('game_id')
        ids_list.append(x)  # putting the IDs on a list
    for id in ids_list:
        game_url = f"https://livecapper.ru/bet365/event.php?id={id}"  # the URL of every single football game
        game_urls.append(game_url)
    for g in game_urls:
        response = requests.get(g).text
        soup = bs(response, "html.parser")
        for t in soup.find_all("td", class_=['red1', 'red2', 'red3'], limit=1):  # detecting the change in HTML
            for g in soup.find_all("h1"):
                game_name.append(g.get_text()) if g.get_text() not in game_name else game_name
    for f in game_name:
        game_url = 'https://api.telegram.org/botTOKEN/sendMessage?chat_id=-609XXXXXX&text=Fixed Alert : {}'.format(f)  # sending notification to telegram bot
        if game_url not in gfix:
            gfix.append(game_url)
            requests.get(game_url)
        else:
            pass
    ids_list.clear
    game_name.clear
    game_urls.clear
    time.sleep(1)
As you can see, I'm using a while(True): loop to run the code 24/7, but the problem is that each iteration takes roughly twice as long as the previous one,
e.g. 1st iteration = 10s | 2nd iteration = 20s | 3rd iteration = 40s | 4th iteration = 80s.
What can I do to keep every iteration as fast as possible?
CodePudding user response:
Change these:
ids_list.clear
game_name.clear
game_urls.clear
to:
ids_list.clear()
game_name.clear()
game_urls.clear()
Without the parentheses, you aren't calling the methods, but are merely accessing them and then discarding them (i.e., it does nothing).
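A minimal demonstration of the difference, using a plain list rather than the scraper's variables:

```python
items = [1, 2, 3]

# Without parentheses this only looks up the bound method object
# and immediately discards it; the list is untouched.
items.clear
assert items == [1, 2, 3]

# With parentheses the method is actually called and the list is emptied.
items.clear()
assert items == []
```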
CodePudding user response:
There are quite a few issues with the code, but ultimately the reason each pass takes longer is that you keep appending to your lists, so after every iteration those lists grow bigger and bigger (duplicates included). There are a few things you could do:
- Initialize those empty lists inside your loop
- Remove duplicates from the lists so the same URL isn't requested multiple times in each iteration
- Call .clear() correctly (with parentheses)
I simply did the first, since it looks like what you want is to start each iteration with clean lists.
from distutils.command.clean import clean
import time
import requests
from bs4 import BeautifulSoup as bs
import urllib3
urllib3.disable_warnings(urllib3.exceptions.InsecureRequestWarning)
gfix=[]
livecapper_url ="https://livecapper.ru/bet365/" #the website link
while(True):
    ids_list = []
    game_urls = []
    game_name = []
    page = requests.get(livecapper_url, verify=False).text
    soup = bs(page, "html.parser")
    game_ids = soup.find_all(game_id=True)  # getting the IDs of every football game
    for g in game_ids:
        x = g.get('game_id')
        ids_list.append(x)  # putting the IDs on a list
    for id in ids_list:
        game_url = f"https://livecapper.ru/bet365/event.php?id={id}"  # the URL of every single football game
        game_urls.append(game_url)
    for g in game_urls:
        response = requests.get(g).text
        soup = bs(response, "html.parser")
        for t in soup.find_all("td", class_=['red1', 'red2', 'red3'], limit=1):  # detecting the change in HTML
            for g in soup.find_all("h1"):
                game_name.append(g.get_text()) if g.get_text() not in game_name else game_name
    for f in game_name:
        game_url = 'https://api.telegram.org/botTOKEN/sendMessage?chat_id=-609XXXXXX&text=Fixed Alert : {}'.format(f)  # sending notification to telegram bot
        if game_url not in gfix:
            gfix.append(game_url)
            requests.get(game_url)
        else:
            pass
    time.sleep(1)
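If you also want the second suggestion (removing duplicates), a small helper is enough; this is just a sketch with made-up example IDs, not part of the original script:

```python
def unique_ids(ids):
    """Remove duplicate IDs while preserving first-seen order.

    dict.fromkeys keeps only the first occurrence of each key and,
    since Python 3.7, dicts preserve insertion order.
    """
    return list(dict.fromkeys(ids))

# Scraped IDs often repeat across page elements:
print(unique_ids(['101', '102', '101', '103', '102']))  # ['101', '102', '103']
```

Applying this to ids_list before building game_urls means each event page is requested at most once per iteration.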