Not able to download the file from a website using Selenium and Python

Time:10-15

I am trying to download the daily report from the NSE India website using Selenium and Python.

Approach to download the daily report

  • The website loads with no data
  • After some time, the page is populated with the report information
  • Once the report data has loaded, "table[@id='etfTable']" appears
  • An explicit wait is added in the code to wait until "table[@id='etfTable']" loads

Code for explicit wait

element=WebDriverWait(driver,50).until(EC.visibility_of_element_located(By.xpath,"//table[@id='etfTable']"))

  • Extract the onclick event using xpath

    downloadcsv= driver.find_element_by_xpath("//div[@id='esw-etf']/div[2]/div/div[3]/div/ul/li/a")

  • Trigger the click to download the file

Full code

from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC

options =webdriver.ChromeOptions();
prefs={"download.default_directory":"/Volumes/Project/WebScrapper/downloadData"};
options.binary_location=r'/Applications/Google Chrome 2.app/Contents/MacOS/Google Chrome'
chrome_driver_binary =r'/usr/local/Caskroom/chromedriver/94.0.4606.61/chromedriver'
options.add_experimental_option("prefs",prefs)

driver =webdriver.Chrome(chrome_driver_binary,options=options)

try:
  #driver.implicity_wait(10)
  driver.get('https://www.nseindia.com/market-data/exchange-traded-funds-etf')
  element =WebDriverWait(driver,50).until(EC.visibility_of_element_located(By.xpath,"//table[@id='etfTable']"))
  downloadcsv= driver.find_element_by_xpath("//div[@id='esw-etf']/div[2]/div/div[3]/div/ul/li/a")
  print(downloadcsv)
  downloadcsv.click()
  time.sleep(5)
  driver.close()
except:
  print("Invalid URL")

Issue I am facing

  • Under Selenium the page keeps on loading, but when launched without Selenium the daily report loads

Normal Loading via Selenium

  • Not able to download the daily report

CodePudding user response:

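There are a few errors in the program. The semicolons at the end of some lines are unnecessary in Python (though harmless), and the locator passed to the expected condition inside WebDriverWait is missing its brackets: it must be a single (By.XPATH, "...") tuple, and the constant is By.XPATH, not By.xpath.

Try the code below and confirm whether it works.

It uses JavaScript to click on the element.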

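driver.get("https://www.nseindia.com/market-data/exchange-traded-funds-etf")
# Wait until a data row of the table is visible; the locator is one (By, value) tuple
element = WebDriverWait(driver, 50).until(EC.visibility_of_element_located((By.XPATH, "//table[@id='etfTable']/tbody/tr[2]")))

# Locate the link that wraps the CSV icon
downloadcsv = driver.find_element(By.XPATH, "//img[@title='csv']/parent::a")
print(downloadcsv)
# Click through JavaScript instead of a native WebElement.click()
driver.execute_script("arguments[0].click();", downloadcsv)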
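
After the click, the question's script pauses for a fixed 5 seconds before closing the browser. A possible alternative is to poll the download directory until the file is complete. The following is only a sketch: the helper name wait_for_download and its parameters are hypothetical, and it assumes the download directory holds no earlier .csv files and that Chrome marks in-progress downloads with a .crdownload suffix.

```python
import os
import time


def wait_for_download(directory, suffix=".csv", timeout=30.0, poll=0.5):
    """Return the path of a finished file ending in `suffix`, or None on timeout.

    Hypothetical helper: assumes `directory` contained no matching files
    before the click, so any match is the new download.
    """
    deadline = time.monotonic() + timeout
    while True:
        names = os.listdir(directory)
        # Chrome keeps a *.crdownload file while a download is still in progress
        in_progress = any(name.endswith(".crdownload") for name in names)
        finished = sorted(name for name in names if name.endswith(suffix))
        if finished and not in_progress:
            return os.path.join(directory, finished[0])
        if time.monotonic() >= deadline:
            return None
        time.sleep(poll)
```

It could be called right after the click, for example path = wait_for_download("/Volumes/Project/WebScrapper/downloadData"), using the directory set in the question's prefs. A None result would suggest the click never started a download.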

CodePudding user response:

It's not an issue with your code; it's an issue with the website. When I checked, most of the time it did not allow me to click on the CSV link. Instead of downloading the CSV file, you can scrape the table.

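from time import sleep
from bs4 import BeautifulSoup

# `browser` is a Selenium WebDriver instance.
# Deleting cookies before going directly to the page is very important; otherwise the site denies access.
browser.delete_all_cookies()
browser.get('https://www.nseindia.com/market-data/exchange-traded-funds-etf')
sleep(5)  # give the page time to load the table data

soup = BeautifulSoup(browser.page_source, 'html.parser')
# scrape the table from the soup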
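
The scraping step itself is left as a comment above. A minimal sketch of how it might look, assuming the table keeps the id etfTable used in the question's XPath: the function names table_to_rows and rows_to_csv and the sample markup are made up for illustration, and the real page's columns will differ.

```python
import csv
import io

from bs4 import BeautifulSoup


def table_to_rows(html, table_id="etfTable"):
    """Return the table's rows as lists of cell text ([] if the table is absent)."""
    soup = BeautifulSoup(html, "html.parser")
    table = soup.find("table", id=table_id)
    if table is None:
        return []
    rows = []
    for tr in table.find_all("tr"):
        # Header cells (th) and data cells (td) are treated alike
        cells = [cell.get_text(strip=True) for cell in tr.find_all(["th", "td"])]
        if cells:
            rows.append(cells)
    return rows


def rows_to_csv(rows):
    """Serialize the rows to CSV text with the standard csv module."""
    buffer = io.StringIO()
    csv.writer(buffer, lineterminator="\n").writerows(rows)
    return buffer.getvalue()


# Made-up sample markup standing in for browser.page_source
sample_html = """
<table id="etfTable">
  <thead><tr><th>SYMBOL</th><th>LTP</th></tr></thead>
  <tbody><tr><td>ABC</td><td>10.5</td></tr></tbody>
</table>
"""
rows = table_to_rows(sample_html)
csv_text = rows_to_csv(rows)
```

With a live session, table_to_rows(browser.page_source) would take the place of the sample string, and the CSV text could be written to a file to stand in for the report the download link would have produced.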