How can I only parse the first HTML block from multiple blocks, if they all contain the same class-n

Time:10-12

I need to parse information from a site that has two blocks, "Today" and "Yesterday". Both have the same class name, standard-box standard-list. How can I parse only the first block (under "Today"), without extracting the information from "Yesterday", when both share the same class name?

Here is my code:

import requests
from bs4 import BeautifulSoup

url_news = "https://www.123.org/"
response = requests.get(url_news)
soup = BeautifulSoup(response.content, "html.parser")
items = soup.findAll("div", class_="standard-box standard-list")
news_info = []
for item in items:
    news_info.append({
        "title": item.find("div", class_="newstext").text,
        "link": item.find("a", class_="newsline article").get("href")
    })

CodePudding user response:

When running your provided code, I don't get an output for items. However, you said that you do, so:

If you only want the data under "Today", use .find() instead of .find_all() (findAll in your code is the older alias of the same method). .find() returns only the first matching tag, which is the "Today" block, and ignores any later matches.

So, instead of:

items = soup.findAll("div", class_="standard-box standard-list")

Use:

items = soup.find("div", class_="standard-box standard-list")
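To illustrate the difference, here is a minimal sketch on a made-up HTML snippet. The markup, titles and links below are assumptions for demonstration only, not the real page:

```python
from bs4 import BeautifulSoup

# Hypothetical markup: two sibling blocks that share the same class attribute.
html = """
<div class="standard-box standard-list"><a class="newsline article" href="/news/1"><div class="newstext">Today A</div></a><a class="newsline article" href="/news/2"><div class="newstext">Today B</div></a></div>
<div class="standard-box standard-list"><a class="newsline article" href="/news/3"><div class="newstext">Yesterday A</div></a></div>
"""

soup = BeautifulSoup(html, "html.parser")

# find_all() returns every matching block; find() returns only the first one.
all_blocks = soup.find_all("div", class_="standard-box standard-list")
first_block = soup.find("div", class_="standard-box standard-list")

# Collect the titles from the first block only.
titles = [div.get_text() for div in first_block.find_all("div", class_="newstext")]

print(len(all_blocks))
print(titles)
```

Here find_all() sees both blocks, while the titles collected from find() come only from the first one.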

Additionally, to find the link, I needed to access the attribute using tag-name[attribute]. Here is working code:

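news_info = []
# find() returns only the first matching block, i.e. the one under "Today"
items = soup.find("div", class_="standard-box standard-list")
# Loop over the article links inside that block instead of its raw children,
# so stray whitespace text nodes between the tags are skipped
for item in items.find_all("a", class_="newsline article"):
    news_info.append(
        {"title": item.find("div", class_="newstext").text, "link": item["href"]}
    )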

print(news_info)

Output:

[{'title': 'NIP crack top 3 ranking for the first time in 5 years', 'link': 'https://www.hltv.org/news/32545/nip-crack-top-3-ranking-for-the-first-time-in-5-years'}, {'title': 'Fessor joins Astralis Talent', 'link': 'https://www.hltv.org/news/32544/fessor-joins-astralis-talent'}, {'title': 'Grashog joins AGO', 'link': 'https://www.hltv.org/news/32542/grashog-joins-ago'}, {'title': 'ISSAA parts ways with Eternal Fire', 'link': 'https://www.hltv.org/news/32543/issaa-parts-ways-with-eternal-fire'}, {'title': 'BLAST Premier Fall Showdown Fantasy live', 'link': 'https://www.hltv.org/news/32541/blast-premier-fall-showdown-fantasy-live'}, {'title': 'FURIA win IEM Fall NA, EG claim final Major Legends spot', 'link': 'https://www.hltv.org/news/32540/furia-win-iem-fall-na-eg-claim-final-major-legends-spot'}]
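
If the block might be missing (for example when the request returns a different page than expected), note that .find() returns None when nothing matches. The following is only a sketch under that assumption: parse_first_block is a hypothetical helper name, and the sample markup is invented for the example.

```python
from bs4 import BeautifulSoup


def parse_first_block(html):
    """Return title/link pairs from the first matching block only."""
    soup = BeautifulSoup(html, "html.parser")
    box = soup.find("div", class_="standard-box standard-list")
    if box is None:
        # No such block in the document: return an empty result instead of failing.
        return []
    news = []
    for link in box.find_all("a", class_="newsline article"):
        title = link.find("div", class_="newstext")
        news.append({
            "title": title.get_text(strip=True) if title else "",
            "link": link.get("href"),  # .get() returns None if href is absent
        })
    return news


# Hypothetical sample: the second block must not show up in the result.
sample = """
<div class="standard-box standard-list">
  <a class="newsline article" href="/news/1"><div class="newstext">Today A</div></a>
</div>
<div class="standard-box standard-list">
  <a class="newsline article" href="/news/2"><div class="newstext">Yesterday A</div></a>
</div>
"""

result = parse_first_block(sample)
print(result)
```

Using link.get("href") rather than link["href"] is a design choice here: it avoids a KeyError if a link has no href attribute.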