Extracting data from multiple URLs using a loop

Time:10-01

As an amateur, I have been working on a small coding project for fun. My goal is to extract some data from multiple URLs. I have reached the point where I can get the data I need from one URL (albeit in a pretty messy way), but now I want to adjust my script so that it collects data from every URL I want.

This is what my humble script currently looks like:

from bs4 import BeautifulSoup
import requests

url = "https://ktarena.com/fr/207-dofus-world-cup/match/46271/1"
page = requests.get(url)
soup = BeautifulSoup(page.content, 'html.parser')

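# Print the title attribute of every <img> tag on the page
for KTA in soup('img'):
    KTAclass = KTA.get('title')
    print(KTAclass)

# find_all already searches the whole document, so no outer loop is needed
KTApoints = soup.find_all('div', class_="points")
print(KTApoints)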

So, I need a way to run this script on multiple URLs and collect all of that data. As you can see, the URL ends with 'match/46271/1'. I need to change the number between the slashes: the first match I want to analyze is 46271 and the last is 46394 (so around 120 URLs to analyze).

If anyone could point me in the right direction, that would be greatly appreciated!

CodePudding user response:

Try:

import requests
from bs4 import BeautifulSoup


def analyze(i):
    url = f"https://ktarena.com/fr/207-dofus-world-cup/match/{i}/1"
    page = requests.get(url)
    soup = BeautifulSoup(page.content, "html.parser")

    names = [a.text for a in soup.select(".name a")]
    points = [p.text for p in soup.select(".result .points")]
    print(url, *zip(names, points))


for i in range(46271, 46274):  # <-- the stop is exclusive; use 46395 to include match 46394
    analyze(i)

Prints:

https://ktarena.com/fr/207-dofus-world-cup/match/46271/1 ('Shadow Zoo', '0 pts') ('UndisClosed', '60 pts')
https://ktarena.com/fr/207-dofus-world-cup/match/46272/1 ('Laugh Tale', '0 pts') ('FromTheAbyss', '60 pts')
https://ktarena.com/fr/207-dofus-world-cup/match/46273/1 ('Motamawa', '0 pts') ('Espoo', '60 pts')
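
To collect the results rather than just print them, one option is to have analyze return its (name, points) pairs and gather them in a list. The following is only a minimal sketch under that assumption: collect, write_csv, the delay parameter and the CSV column names are hypothetical and not part of the answer above.

```python
import csv
import time


def collect(match_ids, analyze, delay=0.5):
    """Call analyze(match_id) for each ID and gather the rows it returns.

    Assumes analyze returns an iterable of (name, points) pairs
    instead of printing them.
    """
    rows = []
    for match_id in match_ids:
        try:
            pairs = analyze(match_id)
        except Exception as exc:  # e.g. a network error or a missing page
            print(f"skipping {match_id}: {exc}")
            continue
        for name, points in pairs:
            rows.append((match_id, name, points))
        time.sleep(delay)  # short pause between requests to the site
    return rows


def write_csv(rows, out):
    """Write the collected rows to an open text file, with a header line."""
    writer = csv.writer(out)
    writer.writerow(["match_id", "team", "points"])  # hypothetical column names
    writer.writerows(rows)
```

If analyze is changed to end with return list(zip(names, points)), then collect(range(46271, 46395), analyze) should cover matches 46271 through 46394, and the rows can be passed to write_csv together with a file opened with newline="".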