Selenium invalid Xpath

Time:10-02

I am using Selenium to extract the half-hourly weather data from the table at https://www.wunderground.com/history/daily/gb/london/EGLC/date/2018-01-1

Selenium cannot find the data in the given XPaths, for instance: t = driver.find_elements_by_xpath("//table[1]/tbody[1]/tr/td[1]/lib-display-unit/span[1]/span[1]")

Can someone help me identify where the bug is?

import pandas as pd
from selenium import webdriver
import time

driver = webdriver.Chrome(executable_path='C:/Users/jdmba/chromedriver.exe')
Fromdate = '2018-01-1'

def Wheathertemp(date, driver):
    # Concatenate the base URL and the date (the original was missing the + operator)
    url = "https://www.wunderground.com/history/daily/gb/london/EGLC/date/" + date
    driver.get(url)
    time.sleep(10)

    t = driver.find_elements_by_xpath(
        "//table[1]/tbody[1]/tr/td[1]/lib-display-unit/span[1]/span[1]")
    temperature = driver.find_elements_by_xpath(
        "//table[1]/tbody[1]/tr/td[2]/lib-display-unit/span[1]/span[1]")

    Wheather = []

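    # Pair each time cell with its temperature cell so the first row is not skipped;
    # zip also stops at the shorter list, which avoids an IndexError.
    for i, (obs_time, temp) in enumerate(zip(t, temperature), start=1):
        data = {'SP': str(i), 'timeObservation': obs_time.text, 'temperature': temp.text}
        Wheather.append(data)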
    
    driver.quit()

    return Wheather


s = Wheathertemp(Fromdate, driver)
print(s)

CodePudding user response:

The website you are scraping gets its data from an API call.

You can do that as well :-)

Use the code below.

import requests
r = requests.get('https://api.weather.com/v1/location/EGLC:9:GB/observations/historical.json?apiKey=e1f10a1e78da46f5b10a1e78da96f525&units=e&startDate=20180101&endDate=20180101')
if r.status_code == 200:
    print(r.json())
else:
    print(f'Oops. status code is: {r.status_code}')
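
If you need more than one day, the request URL can be built from a date string instead of being hard-coded. The following is only a sketch: the helper name build_history_url and the YOUR_API_KEY placeholder are made up for illustration, and the query parameters are simply the ones visible in the URL above.

```python
from datetime import datetime
from urllib.parse import urlencode

# Endpoint copied from the URL in the answer; {location} is e.g. "EGLC:9:GB".
BASE = "https://api.weather.com/v1/location/{location}/observations/historical.json"


def build_history_url(date_str, location="EGLC:9:GB", api_key="YOUR_API_KEY"):
    """Hypothetical helper: build the historical-observations URL for one day.

    date_str uses the same format as the question ('2018-01-1'); the API URL
    above uses the compact form '20180101', so the date is reformatted here.
    """
    day = datetime.strptime(date_str, "%Y-%m-%d").strftime("%Y%m%d")
    query = urlencode({
        "apiKey": api_key,
        "units": "e",
        "startDate": day,
        "endDate": day,
    })
    return BASE.format(location=location) + "?" + query


print(build_history_url("2018-01-1"))
```

The resulting string can be passed to requests.get() exactly as in the snippet above.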

To see the API call yourself, press F12 in the browser and open Network -> XHR.
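
Once you have the JSON, it can be flattened into the same list of dicts that the Selenium code in the question was building. This is a rough sketch under assumptions: the field names "observations", "valid_time_gmt" (epoch seconds) and "temp" are what I would expect in the response, so check them against the actual payload in the Network tab. The function name observations_to_rows is made up as well.

```python
from datetime import datetime, timezone


def observations_to_rows(payload):
    """Hypothetical helper: turn the API response dict into a list of rows.

    Assumes payload["observations"] is a list of dicts with the keys
    "valid_time_gmt" (epoch seconds) and "temp"; verify against the real JSON.
    """
    rows = []
    for i, obs in enumerate(payload.get("observations", []), start=1):
        ts = obs.get("valid_time_gmt")
        # Format the observation time as HH:MM in UTC, or None if it is missing.
        when = (
            datetime.fromtimestamp(ts, tz=timezone.utc).strftime("%H:%M")
            if ts is not None
            else None
        )
        rows.append({
            "SP": str(i),
            "timeObservation": when,
            "temperature": obs.get("temp"),
        })
    return rows


# Hand-written sample in the assumed shape, not real API output.
sample = {
    "observations": [
        {"valid_time_gmt": 1514764800, "temp": 45},
        {"valid_time_gmt": 1514766600, "temp": 44},
    ]
}
print(observations_to_rows(sample))
```

With a real response you would call observations_to_rows(r.json()) after checking the status code.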
