Web Scraping when Table is one click away

Time:11-22

I am trying to extract table data from the website https://www.svk.se/om-kraftsystemet/kontrollrummet/, specifically from the last section, called "Förbrukning I Sverige". This is the code I am using:

from selenium import webdriver
from selenium.webdriver.chrome.service import Service
from webdriver_manager.chrome import ChromeDriverManager
import pandas as pd

driver = webdriver.Chrome(service=Service(ChromeDriverManager().install()))
driver.get("https://www.svk.se/om-kraftsystemet/kontrollrummet/")

html = driver.page_source

tables = pd.read_html(html)
data = tables[1]

driver.close()

ValueError: No tables found

The issue is that the table I want is not shown by default; the page shows a graph instead. To display the table I need to click the "Tabell" button, but the error is raised before I get the chance to do that. Is there a solution to this?

(Eventually, I want to extract data from multiple days from that table, so if someone wants to point me in the right direction to be able to do that I would be grateful.)

CodePudding user response:

Try the following working example. You first have to accept the cookies, then click the table button. This requires the right element locator strategy together with WebDriverWait, and the click itself is executed through JavaScript.

import time
from io import StringIO

import pandas as pd
from bs4 import BeautifulSoup
from selenium import webdriver
from selenium.webdriver.chrome.service import Service
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.support.wait import WebDriverWait

options = webdriver.ChromeOptions()
options.add_experimental_option("detach", True)  # optional: keep the browser open after the script ends
webdriver_service = Service("./chromedriver")  # your chromedriver path
driver = webdriver.Chrome(service=webdriver_service, options=options)

driver.get('https://www.svk.se/om-kraftsystemet/kontrollrummet/')
driver.maximize_window()
time.sleep(3)

# Accept the cookie banner first.
# The attribute predicate of this XPath is missing in the source; supply the cookie button's locator.
WebDriverWait(driver, 20).until(EC.visibility_of_element_located((By.XPATH, '//*[@]'))).click()

# Wait for the table button to be present, then click it through JavaScript.
tbutton = WebDriverWait(driver, 20).until(EC.presence_of_element_located((By.XPATH, '(//*[@aria-controls="Agsid-3"])[2]')))
driver.execute_script("arguments[0].click();", tbutton)
time.sleep(1)

# Parse the rendered page and read the first table on it.
soup = BeautifulSoup(driver.page_source, "html.parser")

# StringIO avoids the pandas deprecation warning for literal HTML strings.
df = pd.read_html(StringIO(str(soup)))[0]
print(df)

Output:

    Minut  Sekund 00  Sekund 01  Sekund 02  ...  Sekund 56  Sekund 57  Sekund 58  Sekund 59
0   15:54  49.949 Hz  49.949 Hz  49.948 Hz  ...  49.942 Hz  49.942 Hz  49.942 Hz  49.942 Hz      
1   15:55  49.942 Hz  49.942 Hz  49.942 Hz  ...  49.941 Hz  49.941 Hz  49.942 Hz  49.942 Hz      
2   15:56  49.942 Hz  49.942 Hz  49.943 Hz  ...  49.952 Hz  49.953 Hz  49.953 Hz  49.953 Hz      
3   15:57  49.953 Hz  49.954 Hz  49.954 Hz  ...  49.942 Hz  49.942 Hz  49.942 Hz  49.942 Hz      
4   15:58  49.942 Hz  49.942 Hz  49.943 Hz  ...  49.943 Hz  49.943 Hz  49.943 Hz  49.943 Hz      
5   15:59  49.943 Hz  49.943 Hz  49.942 Hz  ...  49.959 Hz  49.959 Hz  49.959 Hz  49.959 Hz      
6   16:00  49.959 Hz   49.96 Hz   49.96 Hz  ...      50 Hz      50 Hz  50.001 Hz  50.001 Hz      
7   16:01  50.001 Hz  50.002 Hz  50.002 Hz  ...  50.025 Hz  50.025 Hz  50.026 Hz  50.026 Hz      
8   16:02  50.027 Hz  50.027 Hz  50.028 Hz  ...  50.043 Hz  50.044 Hz  50.044 Hz  50.045 Hz      
9   16:03  50.045 Hz  50.045 Hz  50.046 Hz  ...  50.046 Hz  50.046 Hz  50.047 Hz  50.047 Hz      
10  16:04  50.047 Hz  50.048 Hz  50.048 Hz  ...  50.045 Hz  50.046 Hz  50.046 Hz  50.046 Hz      
11  16:05  50.047 Hz  50.047 Hz  50.048 Hz  ...  50.035 Hz  50.035 Hz  50.035 Hz  50.034 Hz      
12  16:06  50.034 Hz  50.034 Hz  50.034 Hz  ...  50.033 Hz  50.032 Hz  50.032 Hz  50.031 Hz      
13  16:07   50.03 Hz   50.03 Hz  50.029 Hz  ...  50.027 Hz  50.026 Hz  50.026 Hz  50.025 Hz      
14  16:08  50.024 Hz  50.023 Hz  50.023 Hz  ...   50.02 Hz   50.02 Hz   50.02 Hz   50.02 Hz      
15  16:09   50.02 Hz  50.019 Hz  50.019 Hz  ...  50.013 Hz  50.013 Hz  50.013 Hz  50.014 Hz      
16  16:10  50.014 Hz  50.014 Hz  50.014 Hz  ...  50.003 Hz  50.003 Hz  50.003 Hz  50.003 Hz      
17  16:11  50.002 Hz  50.002 Hz  50.002 Hz  ...  50.019 Hz   50.02 Hz   50.02 Hz  50.021 Hz      
18  16:12  50.021 Hz  50.021 Hz  50.022 Hz  ...  50.015 Hz  50.015 Hz  50.014 Hz  50.014 Hz      
19  16:13  50.014 Hz  50.014 Hz  50.013 Hz  ...  50.002 Hz  50.001 Hz  50.001 Hz  50.001 Hz      
20  16:14  50.001 Hz  50.001 Hz  50.001 Hz  ...   50.02 Hz   50.02 Hz   50.02 Hz   50.02 Hz      
21  16:15   50.02 Hz   50.02 Hz   50.02 Hz  ...  50.015 Hz  50.015 Hz  50.015 Hz  50.015 Hz      
22  16:16  50.015 Hz  50.015 Hz  50.015 Hz  ...  50.016 Hz  50.016 Hz  50.016 Hz  50.016 Hz      
23  16:17  50.015 Hz  50.015 Hz  50.015 Hz  ...  50.023 Hz  50.023 Hz  50.024 Hz  50.024 Hz      
24  16:18  50.024 Hz  50.024 Hz  50.025 Hz  ...  50.022 Hz  50.021 Hz  50.021 Hz  50.021 Hz      
25  16:19  50.021 Hz  50.021 Hz  50.021 Hz  ...  50.039 Hz  50.039 Hz  50.039 Hz  50.039 Hz      
26  16:20  50.038 Hz  50.038 Hz  50.038 Hz  ...  50.023 Hz  50.023 Hz  50.023 Hz  50.023 Hz      
27  16:21  50.022 Hz  50.022 Hz  50.022 Hz  ...  50.031 Hz  50.032 Hz  50.032 Hz  50.032 Hz      
28  16:22  50.032 Hz  50.032 Hz  50.032 Hz  ...  50.029 Hz  50.029 Hz  50.029 Hz   50.03 Hz      
29  16:23   50.03 Hz   50.03 Hz   50.03 Hz  ...  50.043 Hz  50.043 Hz  50.044 Hz  50.044 Hz      
30  16:24  50.044 Hz  50.044 Hz  50.045 Hz  ...  50.033 Hz  50.033 Hz  50.034 Hz  50.034 Hz      
31  16:25  50.034 Hz  50.034 Hz  50.035 Hz  ...  50.047 Hz  50.047 Hz  50.047 Hz  50.047 Hz      
32  16:26  50.047 Hz  50.047 Hz  50.047 Hz  ...  50.045 Hz  50.046 Hz  50.046 Hz  50.046 Hz      
33  16:27  50.046 Hz  50.046 Hz  50.046 Hz  ...  50.036 Hz  50.037 Hz  50.037 Hz  50.037 Hz      
34  16:28  50.037 Hz  50.038 Hz  50.038 Hz  ...   50.04 Hz   50.04 Hz   50.04 Hz   50.04 Hz      
35  16:29   50.04 Hz   50.04 Hz   50.04 Hz  ...  50.039 Hz  50.039 Hz  50.039 Hz   50.04 Hz      
36  16:30   50.04 Hz   50.04 Hz   50.04 Hz  ...  50.037 Hz  50.037 Hz  50.037 Hz  50.037 Hz      
37  16:31  50.038 Hz  50.038 Hz  50.038 Hz  ...  50.034 Hz  50.034 Hz  50.034 Hz  50.034 Hz      
38  16:32  50.033 Hz  50.033 Hz  50.033 Hz  ...  50.041 Hz  50.042 Hz  50.042 Hz  50.042 Hz      
39  16:33  50.042 Hz  50.042 Hz  50.042 Hz  ...  50.032 Hz  50.031 Hz  50.031 Hz  50.031 Hz      
40  16:34   50.03 Hz   50.03 Hz   50.03 Hz  ...  50.043 Hz  50.044 Hz  50.044 Hz  50.044 Hz      
41  16:35  50.045 Hz  50.045 Hz  50.045 Hz  ...  50.032 Hz  50.033 Hz  50.033 Hz  50.034 Hz      
42  16:36  50.034 Hz  50.035 Hz  50.035 Hz  ...  50.017 Hz  50.017 Hz  50.017 Hz  50.017 Hz      
43  16:37  50.017 Hz  50.017 Hz  50.017 Hz  ...   50.01 Hz  50.011 Hz  50.011 Hz  50.011 Hz      
44  16:38  50.012 Hz  50.012 Hz  50.012 Hz  ...  50.021 Hz  50.022 Hz  50.022 Hz  50.022 Hz      
45  16:39  50.022 Hz  50.023 Hz  50.023 Hz  ...  50.016 Hz  50.016 Hz  50.016 Hz  50.016 Hz      
46  16:40  50.016 Hz  50.016 Hz  50.016 Hz  ...  50.003 Hz  50.002 Hz  50.002 Hz  50.002 Hz      
47  16:41  50.002 Hz  50.002 Hz  50.002 Hz  ...  49.999 Hz  49.999 Hz  49.999 Hz      50 Hz      
48  16:42      50 Hz      50 Hz      50 Hz  ...  50.004 Hz  50.004 Hz  50.004 Hz  50.004 Hz      
49  16:43  50.004 Hz  50.004 Hz  50.004 Hz  ...  49.998 Hz  49.998 Hz  49.998 Hz  49.998 Hz      
50  16:44  49.999 Hz  49.999 Hz  49.999 Hz  ...  49.989 Hz  49.989 Hz  49.989 Hz  49.989 Hz      
51  16:45  49.989 Hz   49.99 Hz   49.99 Hz  ...  49.994 Hz  49.994 Hz  49.994 Hz  49.994 Hz      
52  16:46  49.994 Hz  49.994 Hz  49.994 Hz  ...  49.988 Hz  49.988 Hz  49.988 Hz  49.988 Hz      
53  16:47  49.988 Hz  49.988 Hz  49.988 Hz  ...  49.997 Hz  49.997 Hz  49.998 Hz  49.998 Hz      
54  16:48  49.998 Hz  49.999 Hz  49.999 Hz  ...  50.007 Hz  50.007 Hz  50.007 Hz  50.006 Hz
55  16:49  50.006 Hz  50.006 Hz  50.006 Hz  ...  50.002 Hz  50.002 Hz  50.002 Hz  50.002 Hz
56  16:50  50.002 Hz  50.002 Hz  50.002 Hz  ...  50.001 Hz  50.002 Hz  50.002 Hz  50.003 Hz
57  16:51  50.003 Hz  50.004 Hz  50.004 Hz  ...  50.006 Hz  50.006 Hz  50.006 Hz  50.006 Hz
58  16:52  50.006 Hz  50.006 Hz  50.006 Hz  ...  50.003 Hz  50.003 Hz  50.003 Hz  50.004 Hz
59  16:53  50.004 Hz  50.004 Hz  50.004 Hz  ...  50.007 Hz  50.007 Hz  50.007 Hz  50.008 Hz

[60 rows x 61 columns]
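
The printed frame is wide (one row per minute, one column per second) and every cell is a string that includes the unit. If you want numbers you can plot or aggregate, something along these lines may help. It is only a minimal sketch: it assumes every cell looks like "49.949 Hz", the helper names parse_hz and to_long are made up for illustration, and the sample rows are shortened to three seconds each. With the real frame the rows could be taken from df.values.tolist().

```python
def parse_hz(cell):
    """Convert a cell such as '49.949 Hz' to the float 49.949."""
    return float(str(cell).replace("Hz", "").strip())


def to_long(rows):
    """Flatten rows of [minute, value_s00, value_s01, ...] into (time, Hz) pairs."""
    series = []
    for minute, *cells in rows:
        for second, cell in enumerate(cells):
            # Build an HH:MM:SS label from the minute label and the column position.
            series.append((f"{minute}:{second:02d}", parse_hz(cell)))
    return series


# Shortened sample rows in the same shape as the output above (assumed data layout).
sample = [
    ["15:54", "49.949 Hz", "49.949 Hz", "49.948 Hz"],
    ["16:00", "49.959 Hz", "49.96 Hz", "49.96 Hz"],
]

long_series = to_long(sample)
for timestamp, value in long_series:
    print(timestamp, value)
```

The flat list of (time, value) pairs is easier to append to as you collect more data, for example when saving results from several runs.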