Not able to download the file from website using selenium python

I am trying to download the daily report from the website NSE-India using selenium & python.

Approach to download the daily report

Website loads with no data
After X time, the page is loaded with report information
Once the page is loaded with report data, "table[@id='etfTable']" appears
An explicit wait is added in the code, to wait until "table[@id='etfTable']" loads

Code for explicit wait

element=WebDriverWait(driver,50).until(EC.visibility_of_element_located(By.xpath,"//table[@id='etfTable']"))

Extract the onclick event using xpath

downloadcsv= driver.find_element_by_xpath("//div[@id='esw-etf']/div[2]/div/div[3]/div/ul/li/a")

Trigger the click to download the file

Full code

from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC

options =webdriver.ChromeOptions();
prefs={"download.default_directory":"/Volumes/Project/WebScrapper/downloadData"};
options.binary_location=r'/Applications/Google Chrome 2.app/Contents/MacOS/Google Chrome'
chrome_driver_binary =r'/usr/local/Caskroom/chromedriver/94.0.4606.61/chromedriver'
options.add_experimental_option("prefs",prefs)

driver =webdriver.Chrome(chrome_driver_binary,options=options)

try:
  #driver.implicity_wait(10)
  driver.get('https://www.nseindia.com/market-data/exchange-traded-funds-etf')
  element =WebDriverWait(driver,50).until(EC.visibility_of_element_located(By.xpath,"//table[@id='etfTable']"))
  downloadcsv= driver.find_element_by_xpath("//div[@id='esw-etf']/div[2]/div/div[3]/div/ul/li/a")
  print(downloadcsv)
  downloadcsv.click()
  time.sleep(5)
  driver.close()
except:
  print("Invalid URL")

Issue I am facing

The page keeps on loading when launched via Selenium, but when the site is opened without Selenium the daily report loads normally.

Normal Loading via Selenium

Not able to download the daily report

CodePudding user response：

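There are some errors in the program. A few lines end with unnecessary semicolons, and in the WebDriverWait call the locator must be passed as one tuple, (By.XPATH, "//table[@id='etfTable']"), so a pair of brackets is missing. The constant is also By.XPATH, not By.xpath.

Try the code below and confirm whether it works.

You can also use JavaScript to click on that element.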

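driver.get("https://www.nseindia.com/market-data/exchange-traded-funds-etf")
# Wait until a data row of the table is visible; the locator is passed as a single (By, value) tuple
element = WebDriverWait(driver, 50).until(EC.visibility_of_element_located((By.XPATH, "//table[@id='etfTable']/tbody/tr[2]")))

# Locate the link that wraps the CSV icon
downloadcsv = driver.find_element(By.XPATH, "//img[@title='csv']/parent::a")
print(downloadcsv)
# Click through JavaScript instead of a native click
driver.execute_script("arguments[0].click();", downloadcsv)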
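
Once the click goes through, the script still has to wait for Chrome to finish writing the file; the code in the question uses a fixed time.sleep(5) for this. A minimal sketch of a polling alternative follows. The helper name wait_for_download and its parameters are hypothetical. It assumes the download directory contains no older files with the same suffix, and it relies on Chrome keeping in-progress downloads under a ".crdownload" suffix.

```python
import time
from pathlib import Path


def wait_for_download(directory, suffix=".csv", timeout=30.0, poll=0.5):
    """Return the path of the first finished file ending in `suffix`.

    Hypothetical helper: the download is treated as finished once a
    matching file exists and no ".crdownload" file (Chrome's suffix for
    in-progress downloads) remains in the directory.
    """
    folder = Path(directory)
    deadline = time.monotonic() + timeout
    while True:
        finished = sorted(folder.glob("*" + suffix))
        partial = list(folder.glob("*.crdownload"))
        if finished and not partial:
            return finished[0]
        if time.monotonic() >= deadline:
            raise TimeoutError(f"no finished {suffix} file in {folder}")
        time.sleep(poll)
```

It could be called right after the click, for example wait_for_download("/Volumes/Project/WebScrapper/downloadData"), using the directory set in "download.default_directory".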

CodePudding user response：

It's not an issue with your code; it's an issue with the website. When I checked, most of the time it did not allow me to click on the CSV link. Instead of downloading the CSV file, you can scrape the table.

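from time import sleep
from bs4 import BeautifulSoup

# browser is the Selenium WebDriver instance
# Deleting cookies before going directly to the page is very important; otherwise the site denies access

browser.delete_all_cookies()
browser.get('https://www.nseindia.com/market-data/exchange-traded-funds-etf')
sleep(5)  # give the page time to load the report data

soup = BeautifulSoup(browser.page_source, 'html.parser')
# scrape the table from the soup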
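
The table-scraping step is left open above. As a rough sketch of one way to do it, the code below turns the page source into CSV text. Note that it swaps BeautifulSoup for the standard library's html.parser so that it runs without extra packages. The names TableRows and table_to_csv are hypothetical, and the table id "etfTable" is taken from the question.

```python
import csv
import io
from html.parser import HTMLParser


class TableRows(HTMLParser):
    """Collect the cell text of every <tr> inside the table with a given id."""

    def __init__(self, table_id):
        super().__init__()
        self.table_id = table_id
        self.depth = 0        # greater than 0 while inside the target table
        self.in_cell = False  # True while inside a <td> or <th>
        self.rows = []

    def handle_starttag(self, tag, attrs):
        if tag == "table":
            # Enter the target table, or track a table nested inside it
            if self.depth or dict(attrs).get("id") == self.table_id:
                self.depth += 1
        elif self.depth and tag == "tr":
            self.rows.append([])
        elif self.depth and tag in ("td", "th") and self.rows:
            self.in_cell = True
            self.rows[-1].append("")

    def handle_endtag(self, tag):
        if tag == "table" and self.depth:
            self.depth -= 1
        elif tag in ("td", "th"):
            self.in_cell = False

    def handle_data(self, data):
        if self.depth and self.in_cell:
            self.rows[-1][-1] += data


def table_to_csv(page_source, table_id="etfTable"):
    """Return the rows of the table as CSV text, one line per <tr>."""
    parser = TableRows(table_id)
    parser.feed(page_source)
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    for row in parser.rows:
        # Collapse runs of whitespace inside each cell
        writer.writerow([" ".join(cell.split()) for cell in row])
    return out.getvalue()
```

With the Selenium session above, this might be used as csv_text = table_to_csv(browser.page_source), and the result written to a file with the usual open(...).write(csv_text).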