BeautifulSoup can't get pre tag

Time:08-01

I need to get the text from this page, but the code below prints None:

import requests
from bs4 import BeautifulSoup

url = "http://www.koeri.boun.edu.tr/sismo/2/latest-earthquakes/list-of-latest-events/"

response = requests.get(url)
html = response.content
soup = BeautifulSoup(html, "html.parser")

table = soup.find("pre")

print(table)

I also tried Selenium:

import requests
from bs4 import BeautifulSoup
from selenium import webdriver

browser = webdriver.Chrome()
browser.get("http://www.koeri.boun.edu.tr/sismo/2/latest-earthquakes/list-of-latest-events/")

html = browser.page_source
soup = BeautifulSoup(html, "html.parser")
table = soup.find("pre")

print(table)

Switching the parser from html.parser to html5lib or lxml didn't help either.

I found that the <pre> tag doesn't exist in the page source, so I guess the content is loaded dynamically. Is there a way to access it?

Answer:

The <pre> tag is inside an <iframe>, so it is not part of the outer page's HTML. Load it from the iframe's source URL instead:

import requests
from bs4 import BeautifulSoup

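# Request the iframe's own URL instead of the outer page that embeds it.
url = "http://www.koeri.boun.edu.tr/scripts/lasteq.asp"
soup = BeautifulSoup(requests.get(url).content, "html.parser")
print(soup.pre)  # soup.pre is shorthand for soup.find("pre")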

Prints:

<pre>

RECENT EARTHQUAKES IN TURKEY
KOERI REGIONAL EARTHQUAKE-TSUNAMI MONITORING CENTER
(QUICK EPICENTER DETERMINATIONS)

                                                        Magnitude
Date       Time      Latit(N)  Long(E)   Depth(km)     MD   ML   Mw    Region
---------- --------  --------  -------   ----------    ------------    -----------
2022.08.01 07:21:57  36.8547   29.2488        1.4      -.-  1.9  -.-   SOGUTLU-FETHIYE (MUGLA)                           Quick
2022.08.01 07:03:18  37.4368   36.9718        5.0      -.-  3.1  3.2   OKSUZLU-(KAHRAMANMARAS)                           Quick
...
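
If you would rather not hard-code the iframe URL, one option is to read the `src` attribute of each <iframe> on the outer page and follow it. The sketch below is only an illustration: the helper names (`iframe_urls`, `first_pre_text`, `pre_from_page_or_frames`) are made up for this example, and it assumes the iframe is present in the static HTML with a `src` attribute, which may not hold for every page.

```python
from urllib.parse import urljoin

from bs4 import BeautifulSoup


def iframe_urls(html, page_url):
    """Return the absolute URL of every <iframe> that has a src attribute."""
    soup = BeautifulSoup(html, "html.parser")
    # src may be relative, so resolve it against the URL of the embedding page.
    return [
        urljoin(page_url, frame["src"])
        for frame in soup.find_all("iframe", src=True)
    ]


def first_pre_text(html):
    """Return the text of the first <pre> element, or None if there is none."""
    pre = BeautifulSoup(html, "html.parser").find("pre")
    return pre.get_text() if pre is not None else None


def pre_from_page_or_frames(page_url):
    """Look for <pre> on the page itself, then in each iframe it embeds."""
    import requests  # imported here so the helpers above work without it

    response = requests.get(page_url, timeout=30)
    response.raise_for_status()
    text = first_pre_text(response.content)
    if text is not None:
        return text

    # Hypothetical fallback: fetch each embedded document and look there.
    for frame_url in iframe_urls(response.content, page_url):
        frame = requests.get(frame_url, timeout=30)
        frame.raise_for_status()
        text = first_pre_text(frame.content)
        if text is not None:
            return text
    return None
```

Calling `pre_from_page_or_frames(url)` with the URL from the question should then take the same route as the answer above, provided the assumption about the iframe holds. It only looks one level deep, so nested iframes would need a recursive version.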