How to web scrape something that doesn't have an attribute or anything attached to it in Selenium

Time:01-04

I'm trying to scrape electricity prices from a website: https://www.nordpoolgroup.com/en/Market-data1/#/nordic/table. When I inspect the web elements for the date and the price, this is what they look like:

The first date:

<td ng-if="tableData.dataType == 0" ng-bind-html="tableData.axis.y.items[$index].name">03-01-2023</td>

Here I want to access 03-01-2023, which is the date, but I can't seem to reach it.

The full Xpath for this is: /html/body/div[2]/div/div/div[2]/div[2]/div/div[2]/div/div[2]/div[1]/div[2]/table/tbody/tr[1]/td[1]

The first price:

<td ng-repeat="column in row | visibleColumns:enableFilter:tableData:visibleEntities" ng- ng-bind-html="column.value">135,05</td>

Here I want to access 135,05, but how?

Xpath: /html/body/div[2]/div/div/div[2]/div[2]/div/div[2]/div/div[2]/div[1]/div[2]/table/tbody/tr[1]/td[2]
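One way to reach these cells without the full absolute XPath is to match on the ng-bind-html attribute visible in the two td snippets above. The following is a minimal sketch, not a tested solution for this site: the two selectors are assumptions derived only from the quoted markup, the helper name first_date_and_price is hypothetical, and the page must already have rendered the table (for example via an implicit or explicit wait) before it is called.

```python
# Locator strategy string; identical to the value of selenium's By.CSS_SELECTOR.
CSS = "css selector"

# Assumed selectors, derived only from the two <td> snippets quoted above.
DATE_CELL = "td[ng-bind-html^='tableData.axis.y.items']"
PRICE_CELL = "td[ng-bind-html='column.value']"


def first_date_and_price(driver):
    """Return (date_text, price_text) for the first matching cells.

    `driver` is any object exposing Selenium's find_element(by, value).
    """
    date_cell = driver.find_element(CSS, DATE_CELL)
    price_cell = driver.find_element(CSS, PRICE_CELL)
    # .text is the visible string inside the element, not the element object
    return date_cell.text, price_cell.text
```

find_element returns the first match in document order, so if the page contains other cells with the same attributes before the table, the selectors would need to be narrowed.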

What I tried:

I tried locating the elements by class and by tag name, expecting to get the date and the price back, but it didn't work. Instead it gave me:

<selenium.webdriver.remote.webelement.WebElement (session="*random characters*", element="*random characters*")>

So it gave me some session and web element information, but not the date or the price.

Code in Python (Tried returning the whole table of dates and prices):

from selenium import webdriver
from selenium.webdriver.common.by import By
import time
from selenium.webdriver.support.ui import WebDriverWait 
from selenium.webdriver.support import expected_conditions as EC

url = "https://www.nordpoolgroup.com/en/Market-data1/#/nordic/table"

PATH = "C:/PATH"
driver = webdriver.Chrome(PATH)

driver.get(url)

Date = WebDriverWait(driver, 10).until(
            EC.presence_of_element_located((By.ID, "datatable"))
        )

print(Date)
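The wait in the code above returns a WebElement, so print(Date) shows the object's representation rather than the table contents. The visible text is available through the element's .text attribute. A minimal sketch, assuming the element with id "datatable" from the question holds the rendered rows; the helper names table_lines and wait_for_table are hypothetical:

```python
def table_lines(element):
    """Split an element's visible text into non-empty, stripped lines."""
    return [line.strip() for line in element.text.splitlines() if line.strip()]


def wait_for_table(driver, timeout=10):
    """Wait until the element with id "datatable" is visible and return it."""
    # Imported here so table_lines() can be used without Selenium installed.
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support import expected_conditions as EC
    from selenium.webdriver.support.ui import WebDriverWait

    return WebDriverWait(driver, timeout).until(
        EC.visibility_of_element_located((By.ID, "datatable"))
    )


# Usage, given a driver that has already loaded the page:
#   table = wait_for_table(driver)
#   for line in table_lines(table):
#       print(line)
```

Because the page fills the table with JavaScript, the element may become visible before its rows are populated; waiting for a row or cell inside it may be more reliable.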

CodePudding user response:

Try the code below; it prints all the values from the table:

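from selenium import webdriver
from selenium.webdriver.chrome.service import Service as ChromeService
from selenium.webdriver.common.by import By
from webdriver_manager.chrome import ChromeDriverManager  # pip install webdriver-manager

page_url = "https://www.nordpoolgroup.com/en/Market-data1/#/nordic/table"
driver = webdriver.Chrome(service=ChromeService(ChromeDriverManager().install()))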
driver.implicitly_wait(15)
driver.maximize_window()
driver.get(page_url)

table_rows = driver.find_elements(By.XPATH, ".//tr[contains(@ng-repeat,'row in tableData')]")

table_cols = driver.find_elements(By.XPATH, "(.//tr[contains(@ng-repeat,'row in tableData')])[1]/td")

for row in range(len(table_rows)):
    for col in range(len(table_cols)):
        # XPath positions are 1-based, hence row + 1 and col + 1
        print(driver.find_element(By.XPATH, "(.//tr[contains(@ng-repeat,'row in tableData')])[" + str(row + 1) + "]/td[" + str(col + 1) + "]").text, end = " | ")
    print()
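The nested loop above issues one find_element call per cell. A possible variation, sketched here under the assumption that the row XPath from the answer still matches the page, is to fetch each row once and read its cells relative to that row. The helper name table_as_rows is hypothetical.

```python
XPATH = "xpath"  # same value as selenium's By.XPATH

# Row locator taken from the answer above.
ROW_XPATH = ".//tr[contains(@ng-repeat,'row in tableData')]"


def table_as_rows(driver):
    """Return the table as a list of rows, each a list of cell strings."""
    rows = []
    for tr in driver.find_elements(XPATH, ROW_XPATH):
        # "./td" is relative to the current <tr>, so only its own cells match
        cells = tr.find_elements(XPATH, "./td")
        rows.append([td.text for td in cells])
    return rows


# Usage, given a driver that has already loaded the page:
#   for row in table_as_rows(driver):
#       print(" | ".join(row))
```

Returning a list of lists instead of printing directly also makes it easier to pass the data on to further processing.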

Output:

04-01-2023 | 87,86 | 73,41 | 73,41 | 73,41 | 73,41 | 99,96 | 58,94 | 73,41 | 109,08 | 109,08 | 109,08 | 71,86 | 71,86 | 51,79 | 101,68 | 101,68 | 101,68 | 142,75 | 88,60 | 58,94 | 97,74 | 99,89 | 
03-01-2023 | 135,05 | 101,04 | 101,04 | 132,76 | 132,76 | 132,76 | 145,95 | 132,76 | 147,07 | 147,07 | 147,07 | 103,91 | 103,91 | 75,71 | 132,76 | 132,76 | 132,76 | 147,28 | 145,95 | 145,95 | 146,05 | 145,86 | 
02-01-2023 | 127,36 | 116,51 | 116,51 | 123,26 | 123,26 | 123,26 | 124,13 | 123,43 | 141,86 | 141,86 | 141,86 | 121,02 | 121,02 | 80,72 | 123,37 | 123,37 | 123,37 | 126,36 | 125,65 | 123,82 | 124,58 | 131,08 | 
...
...
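The printed values are still strings, with dates in day-month-year order and prices using a decimal comma. If numbers are needed, a line of this output could be converted roughly as follows. This is only a sketch: the helper name parse_output_line is hypothetical, and it assumes every field after the date is a plain number such as 87,86 (no thousands separators or placeholder values).

```python
from datetime import datetime


def parse_output_line(line):
    """Parse 'DD-MM-YYYY | 87,86 | 73,41 | ' into (date, [floats])."""
    # Split on the separator and drop the empty field after the trailing "|"
    fields = [f.strip() for f in line.split("|") if f.strip()]
    day = datetime.strptime(fields[0], "%d-%m-%Y").date()
    # Prices use a decimal comma, so swap it for a dot before float()
    prices = [float(f.replace(",", ".")) for f in fields[1:]]
    return day, prices
```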