How to compare between two dictionaries using threads

Time:07-24

I'm trying to work out how to compare two dictionaries. The first request does a GET and scrapes the data into a dictionary; I then want to compare that with the result of the next request, made the same way, to see whether anything has changed on the webpage. This is what I have so far:

import random
import threading
import time
from concurrent.futures import ThreadPoolExecutor, as_completed

import requests
from bs4 import BeautifulSoup

URLS = [
    'https://github.com/search?q=hello world',
    'https://github.com/search?q=python 3',
    'https://github.com/search?q=world',
    'https://github.com/search?q=i love python',
    'https://github.com/search?q=sport today',
    'https://github.com/search?q=how to code',
    'https://github.com/search?q=banana',
    'https://github.com/search?q=android vs iphone',
    'https://github.com/search?q=please help me',
    'https://github.com/search?q=batman',
]


def doRequest(url):
    response = requests.get(url)
    time.sleep(random.randint(10, 30))
    return response, url


def doScrape(response):
    soup = BeautifulSoup(response.text, 'html.parser')
    return {
        'title': soup.find("input", {"name": "q"})['value'],
        'repo_count': soup.find("span", {"data-search-type": "Repositories"}).text.strip()
    }


def checkDifference(parsed, url):
    pass  # not implemented yet: this is the part I need help with


def threadPoolLoop():
    with ThreadPoolExecutor(max_workers=1) as executor:
        future_tasks = [
            executor.submit(
                doRequest,
                url
            ) for url in URLS]

        for future in as_completed(future_tasks):
            response, url = future.result()
            if response.status_code == 200:
                checkDifference(doScrape(response), url)


while True:
    t = threading.Thread(target=threadPoolLoop, )
    t.start()
    print('Joining thread and waiting for it to finish...')
    t.join()

My problem is that I do not know how to print a message whenever title and/or repo_count has changed. (The whole point is that I will run this script 24/7, and I always want it to print whenever there has been a change.)

CodePudding user response:

If you're looking for a simple method to compare two dictionaries, there are a few different options.
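
As a rough illustration of two of those options, the sketch below uses plain `==` to ask whether anything changed at all, and a small helper to list which keys changed. The helper name `diff_dicts` and the sample values are made up for this example; they are not taken from the question's code.

```python
def diff_dicts(old, new):
    """Return {key: (old_value, new_value)} for every key whose value differs."""
    missing = object()  # sentinel so a key absent on one side still counts as a change
    changes = {}
    for key in old.keys() | new.keys():
        before = old.get(key, missing)
        after = new.get(key, missing)
        if before != after:
            changes[key] = (
                None if before is missing else before,
                None if after is missing else after,
            )
    return changes


# Made-up sample data shaped like the output of doScrape()
first = {'title': 'batman', 'repo_count': '1K'}
second = {'title': 'batman', 'repo_count': '2K'}

print(first == second)            # False: the two scrapes are not identical
print(diff_dicts(first, second))  # {'repo_count': ('1K', '2K')}
```

Equality is enough if you only need a yes/no answer; the helper is useful when you want to print exactly which field changed.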

Some good resources to begin:

Let's start with two dictionaries to compare
