Webscraping question in python using Selenium

Time:07-21

I am trying to scrape a page using Selenium in Python. I want the solar data from this site and section: [screenshot]

I think the problem I'm having is that the "Chart data (CSV)" menu option does not function as a button, so clicking it doesn't work. This is what I see when I inspect the element before and after clicking the "Chart data (CSV)" menu option.

Before: <a id="downloadRenewablesCSV" data-type="text/csv">Chart data (CSV)</a>

After: <a id="downloadRenewablesCSV" data-type="text/csv" href="data:text/csv;charset=utf8,Renewables 07/20%2 ... [a lot of encoded data] ...2C209,211,211,211,212,211,211,210 " download="CAISO-renewables-20220720.csv">Chart data (CSV)</a>

Originally I assumed it was just a button element that would download the CSV file, and I was trying to do this:

from selenium import webdriver
from selenium.webdriver.common.keys import Keys

driver = webdriver.Chrome(executable_path='PATH')
driver.get('https://www.caiso.com/TodaysOutlook/Pages/supply.html')
button = driver.find_element(by='xpath',value='/html/body/div[1]/div[3]/div[8]/div/div/div[2]/nav/div[3]/div/a[1]')
button.click()

This isn't working. Any advice? I am very new to Selenium, sorry.

CodePudding user response:

You were trying to click the download link without first expanding the dropdown. The link only becomes interactable after the dropdown has been clicked.

The show class is added dynamically to the div only once the <button> with the text Download is clicked.

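The code below clicks the dropdown button first and then the download link:

from selenium.webdriver.common.by import By

# Open the Download dropdown so its menu entries become interactable
dropdown = driver.find_element(By.XPATH, "//button[@id='dropdownMenuRenewables']")
dropdown.click()
# Click the "Chart data (CSV)" entry
download_b = driver.find_element(By.XPATH, "//a[@id='downloadRenewablesCSV']")
download_b.click()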

This will download the file for you.

CodePudding user response:

JS Path Interaction:

XPath selectors can be a bit finicky. I would go back to basics and interact with the element via its JS path instead. I was able to reproduce the error, and clicking the element through its JS path downloaded the report. Use the following updated code:

driver.get('https://www.caiso.com/TodaysOutlook/Pages/supply.html')
# Click the link from JavaScript, which bypasses Selenium's interactability check
driver.execute_script("document.querySelector('#downloadRenewablesCSV').click();")
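As the question's "After" snippet shows, once the link has been clicked its href holds the whole CSV as a percent-encoded data: URI. If you want the data in Python rather than as a downloaded file, you could probably read that attribute and decode it. This is a rough sketch only: `csv_text_from_data_uri` is a hypothetical helper, and it assumes the href is populated by the click and is not base64-encoded.

```python
from urllib.parse import unquote


def csv_text_from_data_uri(href):
    """Return the decoded payload of a 'data:text/csv;...,<payload>' URI."""
    if not href or not href.startswith("data:"):
        raise ValueError("expected a data: URI")
    # Everything before the first comma is the media type and its parameters.
    header, sep, payload = href.partition(",")
    if not sep:
        raise ValueError("data URI has no payload")
    if header.endswith(";base64"):
        raise ValueError("base64 payloads are not handled in this sketch")
    # Percent-decode the payload, e.g. %2C becomes "," and %0A a newline.
    return unquote(payload)


# Possible usage after the click (assumes By is imported and driver is set up):
# link = driver.find_element(By.ID, "downloadRenewablesCSV")
# csv_text = csv_text_from_data_uri(link.get_attribute("href"))
```

The resulting string can then be passed to the `csv` module or written to a file of your choosing.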