Selenium download a pdf to a specified path

Time:12-30

I have created a program that goes to a certain website, enters a folio number, then scrolls down the page and clicks a link to generate a PDF. I now need to download that PDF to a specified path.

I have been able to generate the PDF. However, the keystrokes I send to save it have no effect.

PDF file url: https://gisweb.miamidade.gov/SPTXLienLetters/ReportPage.aspx?folio=0131230371470&pSecurityGuardDistrict=&pStreetLightDistrict=&pMultiPurposeDistrict=&pMunicipalityForDistrict=MIAMI&pAddressForDistrict=1400 NW 44 ST

Code used to go to page and generate pdf:

options = webdriver.ChromeOptions()
options.headless = True
driver = webdriver.Chrome(ChromeDriverManager().install(), options=options)
driver.implicitly_wait(10)

#set chromedriver.exe path 
driver = webdriver.Chrome('driver/chromedriver.exe')
driver.maximize_window()
#driver.execute_script("document.body.style.zoom='75%'")

#Launch Url
driver.get('https://gisweb.miamidade.gov/SPTXLienLetters/')

driver.find_element_by_xpath('//*[@id="divSplashScreenContent"]/table/tbody/tr[2]/td/div/div').click()

#Take data from config file
file = open('configsa.txt')
lines = file.readlines()
folio_number = lines[0]

driver.implicitly_wait(30)

#Find elements and take snapshots
elementID = driver.find_element_by_xpath('//*[@id="txtAddress"]')
elementID.send_keys(folio_number)

elementID = driver.find_element_by_xpath('//*[@id="tdDivAddress"]/table/tbody/tr/td[2]').click()

elementID = driver.find_element_by_xpath('//*[@id="trOtherAppLinks"]/td/div/span[1]/a').click()

I have tried the following code to save the file, but to no avail:

action_chains = ActionChains(driver)
action_chains.send_keys(Keys.CONTROL).send_keys('s').perform()

OR

saveas = ActionChains(driver).key_down(Keys.CONTROL).send_keys('S').key_up(Keys.CONTROL).send_keys('MyDocumentName').key_down(Keys.ALT).send_keys('S').key_up(Keys.ALT)

Neither of the above snippets works. Can anyone help with this?

CodePudding user response:

Clicking the "Generate and Print Property Lien Letter" label opens a new tab in which the generated document is displayed.
You have to switch to the new tab, press CONTROL + S there to save the document, close the new tab, and switch back to the original window:

# Remember the original window so we can return to it later
window1 = driver.window_handles[0]
elementID = driver.find_element_by_xpath('//*[@id="txtAddress"]')
elementID.send_keys(folio_number)

driver.find_element_by_xpath('//*[@id="tdDivAddress"]/table/tbody/tr/td[2]').click()

# This click opens the generated document in a new tab
driver.find_element_by_xpath('//*[@id="trOtherAppLinks"]/td/div/span[1]/a').click()
window2 = driver.window_handles[1]
driver.switch_to.window(window2)
# Hold CONTROL while pressing 's', then release it
action_chains = ActionChains(driver)
action_chains.key_down(Keys.CONTROL).send_keys('s').key_up(Keys.CONTROL).perform()
# Close the new tab and return to the original window
driver.close()
driver.switch_to.window(window1)
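
Indexing window_handles[1] assumes the new tab is always the second handle. A minimal sketch of a more defensive lookup is given here; find_new_handle is a hypothetical helper name, not part of Selenium, and it only compares the handle lists taken before and after the click:

```python
def find_new_handle(handles_before, handles_after):
    """Return the single handle that appears in handles_after but not in handles_before."""
    known = set(handles_before)
    new_handles = [h for h in handles_after if h not in known]
    if len(new_handles) != 1:
        raise RuntimeError(
            "expected exactly one new window, found %d" % len(new_handles)
        )
    return new_handles[0]


# Intended use with a live driver (not executed here):
#   before = driver.window_handles
#   ...click the link that opens the new tab...
#   driver.switch_to.window(find_new_handle(before, driver.window_handles))
```

If the tab has not finished opening when the handles are read the second time, the helper raises instead of silently switching to the wrong window.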

CodePudding user response:

When you click Generate and Print Property Lien Letter, a new tab opens. Selenium does not follow it automatically, so you have to tell Selenium explicitly to switch to the new tab:

current_windows = driver.current_window_handle
elementID = driver.find_element_by_xpath('//*[@id="trOtherAppLinks"]/td/div/span[1]/a').click()
all_handles = driver.window_handles
driver.switch_to.window(all_handles[1])
print('Switch to new tab has been successful.')

Also, Ctrl+S will open a download dialog; you will have to write non-Selenium code to handle that as well.
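
One way to sidestep the dialog, sketched here as an assumption rather than something tested against this site, is to ask Chrome to download PDFs straight into a chosen folder instead of previewing them. The preference keys below are Chrome preferences commonly used for this and their behavior may differ between Chrome versions; build_pdf_download_prefs and make_driver are hypothetical helper names:

```python
import os


def build_pdf_download_prefs(download_dir):
    """Chrome preferences that save PDFs into download_dir without a prompt."""
    return {
        # Chrome expects an absolute path here
        "download.default_directory": os.path.abspath(download_dir),
        "download.prompt_for_download": False,
        # Download PDFs instead of opening them in the built-in viewer
        "plugins.always_open_pdf_externally": True,
    }


def make_driver(download_dir):
    """Create a Chrome driver configured with the preferences above."""
    # Imported lazily so the prefs helper works without Selenium installed
    from selenium import webdriver

    options = webdriver.ChromeOptions()
    options.add_experimental_option("prefs", build_pdf_download_prefs(download_dir))
    return webdriver.Chrome(options=options)
```

With a driver created this way, clicking the link would be expected to drop the PDF into download_dir, so no Ctrl+S keystroke is needed.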

You should not call driver.implicitly_wait(30) multiple times in your code. It is an implicit wait, so it should be set once for the entire lifetime of the WebDriver.
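
The implicit wait only affects element lookups, so it does not wait for the new tab to appear. A rough sketch of an explicit polling helper for that case follows; wait_until is a hypothetical, hand-rolled stand-in for Selenium's own WebDriverWait, kept free of Selenium imports so it can be read in isolation:

```python
import time


def wait_until(condition, timeout=10.0, poll=0.25):
    """Call condition() repeatedly until it returns a truthy value or timeout expires."""
    deadline = time.monotonic() + timeout
    while True:
        result = condition()
        if result:
            return result
        if time.monotonic() >= deadline:
            raise TimeoutError("condition not met within %.1f seconds" % timeout)
        time.sleep(poll)


# Intended use with a live driver (not executed here):
#   driver.implicitly_wait(10)   # set once, right after creating the driver
#   ...click the link that opens the new tab...
#   wait_until(lambda: len(driver.window_handles) > 1)
```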

CodePudding user response:

The reason you are not able to download the PDF is that Chrome (or whichever browser you use) opens it in its built-in PDF viewer, and that viewer tab is outside the scope of the Selenium WebDriver. You can tell because you cannot inspect the page and select the input box that specifies the save path. There is a much simpler way to download the file, though: once you have the PDF URL, you can use the urllib library as shown below.

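import urllib.request
# Spaces in the address are percent-encoded (%20) because recent Python versions reject URLs that contain spaces
pdf_url = "https://gisweb.miamidade.gov/SPTXLienLetters/ReportPage.aspx?folio=0131230371470&pSecurityGuardDistrict=&pStreetLightDistrict=&pMultiPurposeDistrict=&pMunicipalityForDistrict=MIAMI&pAddressForDistrict=1400%20NW%2044%20ST"
download_path = "local_filepath/filename.pdf"
# Fetch the URL and write the response body to download_path
urllib.request.urlretrieve(pdf_url, download_path)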
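
Since the folio number comes from a config file, the URL probably needs to be built per run rather than hard-coded. A minimal sketch, assuming the query parameters seen in the example URL are all the page needs and that the server returns the PDF without a browser session (neither is verified here); build_report_url and download_report are hypothetical helper names:

```python
import urllib.request
from urllib.parse import quote, urlencode

REPORT_PAGE = "https://gisweb.miamidade.gov/SPTXLienLetters/ReportPage.aspx"


def build_report_url(folio, municipality, address):
    """Build the report URL, percent-encoding spaces and other unsafe characters."""
    params = {
        "folio": folio,
        "pSecurityGuardDistrict": "",
        "pStreetLightDistrict": "",
        "pMultiPurposeDistrict": "",
        "pMunicipalityForDistrict": municipality,
        "pAddressForDistrict": address,
    }
    # quote_via=quote encodes spaces as %20 rather than '+'
    return REPORT_PAGE + "?" + urlencode(params, quote_via=quote)


def download_report(folio, municipality, address, download_path):
    """Download the generated report to download_path."""
    url = build_report_url(folio, municipality, address)
    urllib.request.urlretrieve(url, download_path)
```

For example, download_report(folio_number.strip(), "MIAMI", "1400 NW 44 ST", "local_filepath/filename.pdf") would request the same URL as above; the strip() removes the trailing newline that readlines() leaves on the folio number.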