I am using Selenium to extract the half-hourly weather data from the table at https://www.wunderground.com/history/daily/gb/london/EGLC/date/2018-01-1
Selenium cannot find the data at the given XPaths, for instance: t = driver.find_elements_by_xpath("//table[1]/tbody[1]/tr/td[1]/lib-display-unit/span[1]/span[1]")
Can someone help me identify where the bug is?
import pandas as pd
from selenium import webdriver
import time

driver = webdriver.Chrome(executable_path='C:/Users/jdmba/chromedriver.exe')
Fromdate = '2018-01-1'

def Wheathertemp(date, driver):
    # Build the page URL for the requested date
    url = "https://www.wunderground.com/history/daily/gb/london/EGLC/date/" + date
    driver.get(url)
    # Give the page time to render the table
    time.sleep(10)
    t = driver.find_elements_by_xpath(
        "//table[1]/tbody[1]/tr/td[1]/lib-display-unit/span[1]/span[1]")
    temperature = driver.find_elements_by_xpath(
        "//table[1]/tbody[1]/tr/td[2]/lib-display-unit/span[1]/span[1]")
    Wheather = []
    for i in range(1, len(temperature)):
        data = {'SP': str(i), 'timeObservation': t[i].text, 'temperature': temperature[i].text}
        Wheather.append(data)
    driver.quit()
    return Wheather

s = Wheathertemp(Fromdate, driver)
print(s)
CodePudding user response:
The website you are scraping makes an API call to fetch the data.
You can call the same API yourself :-)
Use the code below.
import requests

r = requests.get('https://api.weather.com/v1/location/EGLC:9:GB/observations/historical.json?apiKey=e1f10a1e78da46f5b10a1e78da96f525&units=e&startDate=20180101&endDate=20180101')
if r.status_code == 200:
    print(r.json())
else:
    print(f'Oops. status code is: {r.status_code}')
In the browser, press F12 and open Network -> XHR to see the API call.
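To get from the raw JSON to the list of dicts the question was building, something like the following might work. It is only a sketch: the helper names build_params and to_rows are made up for illustration, and the response field names "observations", "valid_time_gmt" (assumed to be a Unix timestamp) and "temp" are assumptions about the payload that should be checked against what r.json() actually returns.

```python
from datetime import datetime, timezone


def build_params(date_str, api_key):
    """Turn a date like '2018-01-1' into the query parameters used in the URL above."""
    day = datetime.strptime(date_str, "%Y-%m-%d").strftime("%Y%m%d")
    return {"apiKey": api_key, "units": "e", "startDate": day, "endDate": day}


def to_rows(payload):
    """Flatten the JSON payload into the same dict shape the question builds.

    ASSUMPTION: the payload has an "observations" list whose items carry
    "valid_time_gmt" (Unix seconds) and "temp"; verify against the real response.
    """
    rows = []
    for i, obs in enumerate(payload.get("observations", []), start=1):
        ts = datetime.fromtimestamp(obs["valid_time_gmt"], tz=timezone.utc)
        rows.append({
            "SP": str(i),
            "timeObservation": ts.strftime("%H:%M"),  # UTC, not local time
            "temperature": obs.get("temp"),
        })
    return rows
```

With requests, the parameters can be passed separately instead of being pasted into the URL, e.g. r = requests.get(base_url, params=build_params('2018-01-1', api_key)) followed by to_rows(r.json()), where base_url is the address above without its query string.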