How do I turn selenium responses from a list of urls to lists of responses depending on the urls the

Time:10-15

I am currently working with Selenium in Python, scraping a set of URLs from a list. My problem is that I want to separate the results into lists according to the URL each one was scraped from. Currently, the output is a single flat list of all results, as follows:

BIOGEN_SCOTLAND_ESRI.zip
Format: ESRI Shapefile, (50.7 kB)
BIOSPH_SCOTLAND_ESRI.zip
Format: ESRI Shapefile, (1.4 MB)
COUNEUR_SCOTLAND_ESRI.zip
Format: ESRI Shapefile, (35.8 kB)
CNTRYPK_SCOTLAND_ESRI.zip
Format: ESRI Shapefile, (183.2 kB)
GCR_SCOTLAND_ESRI.zip
Format: ESRI Shapefile, (3.8 MB)
LNR_SCOTLAND_ESRI.zip
Format: ESRI Shapefile, (243.5 kB)

Here is the snippet of code, limited to the first three datasets, that produces the output above:

for dataset in dataset_index[:3]:
    driver.get(dataset['dataset_link'])

    time.sleep(2)

    filelist = driver.find_elements(By.XPATH, '//*[@id="filelist"]')

    for files in filelist:
        file_name = files.find_element(By.CLASS_NAME, 'c-download-item').text

        print(file_name)

My expected response would be something like:

[BIOGEN_SCOTLAND_ESRI.zip
Format: ESRI Shapefile, (50.7 kB)
BIOSPH_SCOTLAND_ESRI.zip
Format: ESRI Shapefile, (1.4 MB)],
[COUNEUR_SCOTLAND_ESRI.zip
Format: ESRI Shapefile, (35.8 kB)
CNTRYPK_SCOTLAND_ESRI.zip
Format: ESRI Shapefile, (183.2 kB)]

Your assistance would be highly appreciated.

CodePudding user response:

You can use a list comprehension inside the loop, so that each page yields its own list:

filelist = [x.find_element(By.CLASS_NAME, 'c-download-item').text for x in driver.find_elements(By.XPATH, '//*[@id="filelist"]')]
print(filelist)
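To keep the results grouped by URL rather than only printing them, each per-page list can be collected into an outer list. The following is a minimal sketch, not a definitive implementation: the helper names `collect_per_url` and `make_selenium_scraper` are hypothetical, and it assumes that every file entry on a page is an element with the class `c-download-item`, as in the question's code. It also assumes the page content is already present once `driver.get()` returns, so you may still need the `time.sleep(2)` from the question or an explicit wait.

```python
from typing import Callable, Iterable, List


def collect_per_url(
    urls: Iterable[str],
    scrape_page: Callable[[str], Iterable[str]],
) -> List[List[str]]:
    """Return one list of results per URL, in the same order as `urls`."""
    return [list(scrape_page(url)) for url in urls]


def make_selenium_scraper(driver) -> Callable[[str], List[str]]:
    """Build a scrape function bound to an existing Selenium driver.

    Hypothetical helper: assumes each file entry has the class
    'c-download-item', as in the question.
    """
    # Imported here so the grouping helper can be used without Selenium.
    from selenium.webdriver.common.by import By

    def scrape_page(url: str) -> List[str]:
        driver.get(url)
        items = driver.find_elements(By.CLASS_NAME, "c-download-item")
        return [item.text for item in items]

    return scrape_page
```

With the question's variables, this could be called as `collect_per_url([d['dataset_link'] for d in dataset_index[:3]], make_selenium_scraper(driver))`. The result is a list of lists, and `zip(urls, result)` pairs each inner list with the URL it came from.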