I have tried several approaches that work for other websites but not for this URL:
https://www.wunderground.com/hourly/es/barcelona/IBARCE215/date/2022-07-25 (the date in the URL, e.g. 2022-07-25, should be in the future).
I tried:
import requests
import lxml.html as lh
import pandas as pd
url = 'https://www.wunderground.com/hourly/es/barcelona/IBARCE215/date/2022-07-25'
page = requests.get(url)
doc = lh.fromstring(page.content)
tr_elements = doc.xpath('//tr')
But tr_elements is empty. It works with url = 'https://www.wunderground.com/dashboard/pws/ISANSA11/table/2021-11-30/2021-11-30/daily' and with url = 'http://pokemondb.net/pokedex/all', but not with url = 'https://www.wunderground.com/hourly/es/barcelona/IBARCE215/date/2022-07-25'.
I also tried:
import requests
from bs4 import BeautifulSoup
import pandas as pd
url = 'https://www.wunderground.com/hourly/es/barcelona/IBARCE215/date/2022-07-20'
page = requests.get(url)
soup = BeautifulSoup(page.text, 'lxml')
table1 = soup.find('table', id='hourly-forecast-table')
But the table is not found (table1 is None). It works with url = 'https://www.worldometers.info/coronavirus/' and table1 = soup.find('table', id='main_table_countries_today').
In Chrome I used "Ctrl U" (view page source) and "Ctrl Shift I" (developer tools) to inspect the HTML.
For url = 'https://www.wunderground.com/hourly/es/barcelona/IBARCE215/date/2022-07-25' I can see id='hourly-forecast-table' with "Ctrl Shift I" but not with "Ctrl U". I cannot see it in the contents of the soup variable either.
For url = 'https://www.worldometers.info/coronavirus/' I can see id='main_table_countries_today' with "Ctrl U" as well.
I guess this website does something differently.
Thank you very much,
CodePudding user response:
Have you tried using Selenium together with Beautiful Soup? Install Selenium and ChromeDriver, and you can replicate the keystrokes you use, such as "Ctrl U", with Selenium's send_keys method.
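The symptom in the question (the id shows up with "Ctrl Shift I" but not with "Ctrl U") usually means the table is added by JavaScript after the initial HTML is delivered, which is why a browser-driving tool such as Selenium helps. A minimal sketch of how one might check this is below. It uses only the standard library; the helper names IdCollector and has_element_id are made up for illustration, and the two HTML strings are hypothetical stand-ins, not real responses from the site.

```python
from html.parser import HTMLParser


class IdCollector(HTMLParser):
    """Collects the id attribute of every start tag in an HTML document."""

    def __init__(self):
        super().__init__()
        self.ids = set()

    def handle_starttag(self, tag, attrs):
        for name, value in attrs:
            if name == "id" and value:
                self.ids.add(value)


def has_element_id(html, element_id):
    """Return True if any tag in `html` carries id=element_id."""
    collector = IdCollector()
    collector.feed(html)
    collector.close()
    return element_id in collector.ids


# Hypothetical stand-ins: what a plain HTTP client might receive versus
# what the browser holds after its scripts have run.
static_html = "<html><body><app-root></app-root><script src='main.js'></script></body></html>"
rendered_html = "<html><body><table id='hourly-forecast-table'><tr><td>12:00 am</td></tr></table></body></html>"

print(has_element_id(static_html, "hourly-forecast-table"))    # False
print(has_element_id(rendered_html, "hourly-forecast-table"))  # True
```

With a real page, the first argument would be requests.get(url).text for the static HTML, or driver.page_source for the HTML after Selenium has loaded the page. If the id appears only in the latter, requests alone cannot see the table.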
CodePudding user response:
If you are happy to use Selenium with pandas, the following example is for you. I use Selenium with pandas to grab the table data because the table is loaded by JavaScript.
from io import StringIO  # pandas 2.1+ deprecates passing a literal HTML string to read_html
import pandas as pd
from selenium import webdriver
from selenium.webdriver.chrome.service import Service
from webdriver_manager.chrome import ChromeDriverManager
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC
options = webdriver.ChromeOptions()
options.add_argument("start-maximized")
options.add_experimental_option("detach", True)  # keep the browser open after the script ends
driver = webdriver.Chrome(service=Service(ChromeDriverManager().install()), options=options)
# driver.get() returns None, so its result is not assigned
driver.get('https://www.wunderground.com/hourly/es/barcelona/IBARCE215/date/2022-07-20')
# Wait up to 20 seconds for the first table to become visible, then take its HTML
table = WebDriverWait(driver, 20).until(EC.visibility_of_element_located((By.XPATH, '(//table)[1]'))).get_attribute("outerHTML")
df = pd.read_html(StringIO(table))[0]
# Print every row except the last one
print(df.iloc[:-1])
Output:
Time Conditions Temp. Feels Like ... Dew Point Humidity Wind Pressure
0 12:00 am Clear 78 °F 78 °F ... 73 °F 87 °% 3 °mph NNE 30.02 °in
1 1:00 am Clear 77 °F 77 °F ... 72 °F 85 °% 3 °mph NNW 30.02 °in
2 2:00 am Clear 77 °F 81 °F ... 71 °F 81 °% 3 °mph NNW 30.02 °in
3 3:00 am Clear 77 °F 81 °F ... 70 °F 79 °% 3 °mph N 30.02 °in
4 4:00 am Clear 76 °F 80 °F ... 69 °F 78 °% 3 °mph N 30.01 °in
5 5:00 am Clear 76 °F 79 °F ... 67 °F 76 °% 4 °mph NNW 30.01 °in
6 6:00 am Clear 75 °F 77 °F ... 66 °F 74 °% 5 °mph N 30.02 °in
7 7:00 am Sunny 75 °F 76 °F ... 67 °F 76 °% 4 °mph N 30.03 °in
8 8:00 am Sunny 77 °F 81 °F ... 68 °F 73 °% 6 °mph NNE 30.05 °in
9 9:00 am Sunny 80 °F 84 °F ... 69 °F 69 °% 7 °mph NE 30.06 °in
10 10:00 am Sunny 81 °F 87 °F ... 71 °F 71 °% 9 °mph ENE 30.08 °in
11 11:00 am Sunny 82 °F 88 °F ... 72 °F 72 °% 11 °mph E 30.09 °in
12 12:00 pm Sunny 82 °F 88 °F ... 72 °F 72 °% 12 °mph E 30.10 °in
13 1:00 pm Sunny 82 °F 88 °F ... 71 °F 70 °% 12 °mph ESE 30.10 °in
14 2:00 pm Sunny 83 °F 88 °F ... 71 °F 68 °% 12 °mph ESE 30.10 °in
15 3:00 pm Sunny 82 °F 88 °F ... 71 °F 68 °% 12 °mph ESE 30.09 °in
16 4:00 pm Mostly Sunny 83 °F 88 °F ... 71 °F 68 °% 12 °mph ESE 30.09 °in
17 5:00 pm Mostly Sunny 82 °F 88 °F ... 71 °F 70 °% 11 °mph ESE 30.08 °in
18 6:00 pm Sunny 82 °F 87 °F ... 71 °F 70 °% 10 °mph ESE 30.07 °in
19 7:00 pm Mostly Sunny 81 °F 87 °F ... 71 °F 72 °% 9 °mph ESE 30.07 °in
20 8:00 pm Sunny 80 °F 85 °F ... 71 °F 73 °% 8 °mph ESE 30.07 °in
21 9:00 pm Sunny 80 °F 84 °F ... 71 °F 76 °% 7 °mph E 30.08 °in
22 10:00 pm Clear 79 °F 83 °F ... 71 °F 77 °% 5 °mph E 30.09 °in
23 11:00 pm Clear 78 °F 82 °F ... 71 °F 79 °% 3 °mph ENE 30.10 °in
[24 rows x 11 columns]
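The cells in the output above are strings such as "78 °F" or "30.02 °in", so they may need converting before any numeric work. The following is only a sketch of one way to do that, assuming every numeric cell starts with the number; the helper name leading_number is made up for illustration.

```python
import re

# Optional sign, digits, optional decimal part, anchored at the start of the cell.
_LEADING_NUMBER = re.compile(r"^\s*(-?\d+(?:\.\d+)?)")


def leading_number(cell):
    """Return the number a cell starts with as a float, or None if there is none."""
    match = _LEADING_NUMBER.match(str(cell))
    return float(match.group(1)) if match else None


print(leading_number("78 °F"))       # 78.0
print(leading_number("30.02 °in"))   # 30.02
print(leading_number("3 °mph NNE"))  # 3.0
print(leading_number("Clear"))       # None
```

Applied to the DataFrame, this could look like df["Temp."] = df["Temp."].map(leading_number), assuming the column is named as in the printed header. The unit and the wind direction are discarded, so keep the original column if they are still needed.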