I am testing this code.
from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
d = webdriver.Chrome('C:\\Utility\\chromedriver.exe')
d.get('https://developers.humana.com/Resource/PCTFilesList?fileType=innetwork')
# stuck here...
#links =
for link in links:
    d.get(link)
# click page 2, 3, 4, etc., up to 100
for page in range(1, 100):
    page.click
d.quit()
So, I am trying to download the CSV files on page 1, then click page 2 and download those files, then click page 3 and download those files, and so on. The sample code that I shared here should be a start, I think, but it definitely needs some improvements to work right. Any idea how I can do this? Thanks!
CodePudding user response:
You can use this solution:
import requests
length = 1
url = "https://developers.humana.com/Resource/GetData?fileType=innetwork&sEcho=1&iColumns=3&sColumns=,,\
&iDisplayStart=0&iDisplayLength="
r = requests.get(url + str(length))
json_data = r.json()
length = json_data['iTotalRecords']
print("files ", length)
r = requests.get(url + str(length))
json_data = r.json()
for e in json_data['aaData']:
    download_url = "https://developers.humana.com/Resource/DownloadPCTFile?fileType=innetwork&fileName=" + e['name']
    print(e['name'])
    print("download url: ", download_url)
Then just download the files in a loop.
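The download loop itself could look roughly like the sketch below. It is only a sketch: `build_download_url`, `download_files`, and the `downloads` folder are names made up for illustration, and URL-encoding the file name with `quote` is an assumption (the snippet above concatenates the raw name). `requests` is imported inside the function so the URL helper can be tried on its own.

```python
from pathlib import Path
from urllib.parse import quote

# Same download endpoint as in the snippet above.
DOWNLOAD_BASE = ("https://developers.humana.com/Resource/DownloadPCTFile"
                 "?fileType=innetwork&fileName=")


def build_download_url(name):
    """Return the download URL for one file name (URL-encoding is an assumption)."""
    return DOWNLOAD_BASE + quote(name)


def download_files(names, dest_dir="downloads"):
    """Stream each file to dest_dir; names come from e['name'] in aaData."""
    import requests  # imported here so build_download_url works without it

    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    for name in names:
        with requests.get(build_download_url(name), stream=True, timeout=60) as r:
            r.raise_for_status()
            # Path(name).name keeps only the last path component of the name.
            with open(dest / Path(name).name, "wb") as fh:
                for chunk in r.iter_content(chunk_size=65536):
                    fh.write(chunk)
```

With the `json_data` from above, it would be called as `download_files(e['name'] for e in json_data['aaData'])`.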
CodePudding user response:
wait = WebDriverWait(d, 20)
d.get('https://developers.humana.com/Resource/PCTFilesList?fileType=innetwork')
for i in range(2, 101):
    time.sleep(1)
    j = i
    if i > 5:
        j = 5
    #links=d.find_elements(By.CSS_SELECTOR,"a.download-pct-file-link")
    #print(len(links))
    #for link in links:
    #    link.click()
    wait.until(EC.element_to_be_clickable((By.XPATH, f"//a[@data-dt-idx='{j}']"))).click()
    print(f"//a[@data-dt-idx='{j}']")
I got it to go through the pages by capping the value to click at 5 after page 5: data-dt-idx went from 2 to 5 and then stayed at 5. You can most likely do it without time.sleep() if you handle the stale element exceptions.
Import:
import time
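One way to "handle the stales" is to retry the click whenever Selenium raises StaleElementReferenceException, looking the link up again on each attempt. The sketch below is untested against the site; `retry_on` and `click_page_button` are hypothetical helper names, and the attempt count and pause are arbitrary. It still pauses briefly, but only after a failed attempt rather than on every page.

```python
import time


def retry_on(exc_types, action, attempts=5, pause=0.2):
    """Call action() until it succeeds; re-raise after the last failed attempt."""
    for attempt in range(attempts):
        try:
            return action()
        except exc_types:
            if attempt == attempts - 1:
                raise
            time.sleep(pause)


def click_page_button(d, idx, timeout=20):
    """Click the pager link with the given data-dt-idx, retrying on stale elements."""
    # Selenium is imported here so retry_on stays usable on its own.
    from selenium.common.exceptions import StaleElementReferenceException
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support import expected_conditions as EC
    from selenium.webdriver.support.ui import WebDriverWait

    locator = (By.XPATH, f"//a[@data-dt-idx='{idx}']")
    wait = WebDriverWait(d, timeout)
    # The element is located again on every attempt; a stale reference cannot be reused.
    retry_on(StaleElementReferenceException,
             lambda: wait.until(EC.element_to_be_clickable(locator)).click())
```

In the loop above, `click_page_button(d, j)` would take the place of the `time.sleep(1)` and `wait.until(...).click()` lines.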