How to select multiple variables from a dropdown menu in Selenium (python)

Time:09-10

I would like to automatically download different datasets from the World Bank Climate Change Knowledge Portal.

This is an example of a dataset to download.

However, I have two major problems:

  1. I am not able to select the values from the drop-down menus if I switch to the timeseries tab.
  2. I do not know how to select the values for the sub-national unit, since that menu exists only after "Sub-national units" is chosen for the area type.

This is the code that I have written so far:

from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.common.keys import Keys
from selenium.webdriver.support.ui import Select
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
import os



wd = webdriver.Chrome('C:/Users/alber/OneDrive/Desktop/UniTn/WebD/chromedriver_win32/chromedriver.exe')
url = "https://climateknowledgeportal.worldbank.org/download-data"
wd.get(url)

download_b = wd.find_element(By.ID,'ncfile')
tab = wd.find_element(by = By.XPATH , value = '//*[@id="data-download-form-container"]/div/ul/li[3]/a')
tab.click()

#WebDriverWait(wd, 15).until(EC.presence_of_element_located((By.ID, "variable")))

select = Select(wd.find_element(by = By.ID, value = "variable"))
select.select_by_visible_text("Mean-Temperature")

select = Select(wd.find_element(by = By.ID, value = "aggregation"))
select.select_by_visible_text("Monthly")

select = Select(wd.find_element(by = By.ID, value = "type"))
select.select_by_visible_text("Sub-national units")
select = Select(wd.find_element(by = By.ID, value = "country"))
select.select_by_visible_text("Italy")
select = Select(wd.find_element(by = By.ID, value = "timeperiod"))
select.select_by_visible_text("1901 - 2021")
download_b.click()

Thank you for your help.

CodePudding user response:

Given your ultimate goal (to obtain the actual data as a CSV and analyse it), you may want to reconsider your strategy. You are filling in a form with some variables; the form posts those variables to a URL (which you can observe in the browser's Dev tools, Network tab) and gets a result back. Why not use requests for this and avoid the overhead of Selenium? The following is one way of obtaining the data using only requests (and pandas for displaying it):

import requests
import pandas as pd

# Browser-like User-Agent sent with every request
headers = {
    'User-Agent': 'Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/104.0.5112.79 Safari/537.36'
}

s = requests.Session()
s.headers.update(headers)
url = 'https://climateknowledgeportal.worldbank.org/download-data'

# Visit the form page first so the session picks up its cookies
r = s.get(url)
# Mark the following requests as AJAX calls, like the form's own request
s.headers.update({'X-Requested-With': 'XMLHttpRequest'})
# Form fields, as seen in the browser's Network tab
payload = {
    'collection': "cru",
    'variable': "tas",
    'aggregation': "annual",
    'type': "country",
    'country': "DZA",
    'subnational': "",
    'latitude': "",
    'longitude': "",
    'watershed': "",
    'calculation': "",
    'timeperiod': "historical",
    'percentile': "",
    'scenario': "",
    'model': "all",
    'tab': "timeseries"
}
# The JSON reply carries the URL of the generated CSV under 'success'
r = s.post('https://climateknowledgeportal.worldbank.org/download_climateportal_data', data=payload)
csv_url = r.json()['success']
print(csv_url)
# Download the CSV and save it to disk
r = s.get(csv_url)
with open('test_climate.csv', "wb") as f:
    f.write(r.content)
df = pd.read_csv('test_climate.csv')
print(df)  # in a Jupyter notebook, display(df) gives a richer view

The code above visits the original page (to get the cookies), then posts some data (see the payload object) to the API used by the form on the page. That API returns a URL pointing to the actual data; we then fetch that URL, write the data to a CSV file and read it with pandas. The result (also saved in the CSV file) is:

                                        Variable:   tas
NaN     Algeria     Adrar   Ain-Defla   Ain-Temouchent  Alger   Annaba  Batna   Bechar  Bejaia  Biskra  Blida   Bordj Bou Arrer     Bouira  Boumerdes   Chlef   Constantine     Djelfa  El Bayadh   El Oued     El-Tarf     Ghardaia    Guelma  Illizi  Jijel   Khenchela   Laghouat    Mascara     Medea   Mila    Mostaganem  M'Sila  Naama   Oran    Ouargla     Oum El Bouaghi  Relizane    Saida   Setif   Sidi Bel Abbes  Skikda  Souk-Ahras  Tamanrasset     Tebessa     Tindouf     Tiaret  Tipaza  Tissemsilt  Tizi Ouzou  Tlemcen
1901.0  22.84   25.83   16.38   16.89   16.78   16.36   14.89   22.65   14.94   19.28   15.91   14.58   14.59   16.16   17.22   14.28   16.46   18.19   21.18   16.36   20.93   14.78   22.65   15.38   15.88   16.83   16.63   14.98   14.46   17.46   15.93   16.16   17.14   22.04   13.94   17.37   15.17   14.26   15.34   15.90   14.51   25.02   15.48   24.38   15.24   17.06   15.32   15.34   15.25
1902.0  22.84   25.77   16.64   16.98   17.06   16.60   15.13   22.44   15.24   19.50   16.18   14.87   14.87   16.46   17.47   14.53   16.71   18.28   21.35   16.61   21.06   15.00   22.63   15.68   16.07   17.06   16.78   15.24   14.71   17.66   16.18   16.17   17.30   22.17   14.16   17.61   15.32   14.53   15.43   16.17   14.73   25.05   15.68   23.98   15.47   17.32   15.57   15.65   15.27
1903.0  22.75   25.83   16.34   16.83   16.75   16.08   14.70   22.36   14.85   19.08   15.86   14.46   14.48   16.12   17.21   14.11   16.33   18.00   20.93   16.11   20.76   14.50   22.54   15.23   15.63   16.70   16.53   14.90   14.31   17.43   15.78   15.99   17.13   21.84   13.71   17.34   15.06   14.12   15.22   15.73   14.22   25.09   15.19   23.92   15.17   17.03   15.28   15.29   15.15
1904.0  22.89   25.89   16.82   17.25   17.22   16.53   15.11   22.61   15.34   19.49   16.35   14.92   14.99   16.62   17.69   14.50   16.77   18.41   21.27   16.50   21.09   14.95   22.62   15.70   15.97   17.13   16.97   15.40   14.71   17.89   16.24   16.39   17.60   22.12   14.09   17.80   15.49   14.55   15.66   16.18   14.65   25.09   15.58   23.95   15.63   17.52   15.76   15.80   15.55
...     ...     ...     ...     ...     ...     ...     ...     ...     ...     ...     ...     ...     ...     ...     ...     ...     ...     ...     ...     ...     ...     ...     ...     ...     ...     ...     ...     ...     ...     ...     ...     ...     ...     ...     ...     ...     ...     ...     ...     ...     ...     ...     ...     ...     ...     ...     ...     ...     ...
2017.0  23.87   26.65   18.27   18.83   18.69   18.13   16.73   23.94   16.88   21.07   17.77   16.49   16.44   18.06   19.15   16.12   18.21   19.87   22.93   18.06   22.31   16.51   23.44   17.19   17.62   18.60   18.46   16.78   16.31   19.36   17.75   17.98   19.04   23.58   15.72   19.27   17.01   16.14   17.24   17.73   16.23   25.52   17.23   25.41   17.09   18.99   17.19   17.25   17.15
2018.0  23.50   26.13   17.66   17.96   18.19   18.02   16.37   22.80   16.32   20.63   17.25   15.93   15.84   17.49   18.58   15.98   17.65   19.00   22.77   18.02   21.79   16.41   23.56   16.84   17.52   17.95   17.82   16.20   15.94   18.80   17.20   16.95   18.44   23.30   15.60   18.71   16.27   15.63   16.35   17.57   16.16   25.63   17.14   24.11   16.42   18.43   16.58   16.67   16.22
2019.0  23.66   26.47   17.83   18.36   18.36   18.06   16.51   23.39   16.53   20.79   17.43   16.15   16.07   17.70   18.76   16.03   17.82   19.39   22.81   18.02   21.99   16.44   23.43   16.98   17.54   18.16   18.05   16.38   16.09   18.97   17.41   17.45   18.67   23.41   15.63   18.88   16.55   15.82   16.76   17.63   16.18   25.51   17.15   24.71   16.63   18.62   16.73   16.90   16.67
2020.0  23.79   26.57   18.08   18.65   18.63   18.19   16.79   23.66   16.91   21.13   17.68   16.51   16.41   18.00   19.00   16.19   18.13   19.69   23.02   18.15   22.30   16.59   23.49   17.23   17.72   18.40   18.33   16.65   16.36   19.27   17.77   17.80   18.97   23.66   15.80   19.11   16.82   16.17   17.07   17.79   16.32   25.48   17.31   24.92   16.86   18.87   16.98   17.25   16.97
2021.0  23.93   26.62   18.29   18.62   18.87   18.59   17.16   23.62   17.22   21.46   17.92   16.81   16.70   18.32   19.11   16.59   18.41   19.78   23.40   18.61   22.46   16.99   23.74   17.59   18.11   18.67   18.41   16.92   16.73   19.31   18.08   17.78   18.97   23.98   16.20   19.22   16.92   16.50   17.08   18.17   16.74   25.63   17.71   24.86   17.07   19.05   17.20   17.53   16.94
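
In the output above the region names end up in the first data row rather than in the column header, which suggests that the file starts with a metadata line (Variable: tas). A minimal sketch, assuming that layout holds for the downloaded file, skips that line and uses the year column as the index. The sample text and the name read_climate_csv are made up for illustration; only the first few values are copied from the output above.

```python
import io

import pandas as pd

# Tiny stand-in for the downloaded file (assumed layout: one metadata
# line, then a header row whose first cell is empty, then one row per year).
SAMPLE = (
    "Variable:,tas\n"
    ",Algeria,Adrar\n"
    "1901,22.84,25.83\n"
    "1902,22.84,25.77\n"
)


def read_climate_csv(source):
    """Read the CSV, skipping the assumed metadata line.

    `source` can be a file path or a file-like object.
    """
    # skiprows=1 drops the "Variable:" line; index_col=0 turns the
    # unnamed first column (the years) into the index.
    df = pd.read_csv(source, skiprows=1, index_col=0)
    df.index.name = "year"
    return df


df = read_climate_csv(io.StringIO(SAMPLE))
print(df)
```

If the real file has a different number of leading lines, the skiprows value would need adjusting; for the saved file the call would be read_climate_csv('test_climate.csv').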

You can modify the values in the payload object to get all the results you want.
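
As a sketch of what modifying the payload could look like in a loop: the helper below copies the field values from the payload above and overrides only the fields that change. The names build_payload, payload_grid and download_csv are hypothetical, and "ITA" is only a guess at the code the form uses for Italy. The real values, including those for the sub-national field, would have to be read from the Network tab.

```python
from itertools import product

POST_URL = "https://climateknowledgeportal.worldbank.org/download_climateportal_data"

# Field values copied from the payload in the answer above.
BASE_PAYLOAD = {
    "collection": "cru",
    "variable": "tas",
    "aggregation": "annual",
    "type": "country",
    "country": "DZA",
    "subnational": "",
    "latitude": "",
    "longitude": "",
    "watershed": "",
    "calculation": "",
    "timeperiod": "historical",
    "percentile": "",
    "scenario": "",
    "model": "all",
    "tab": "timeseries",
}


def build_payload(**overrides):
    """Return a copy of BASE_PAYLOAD with some fields replaced."""
    unknown = set(overrides) - set(BASE_PAYLOAD)
    if unknown:
        raise KeyError(f"unknown form fields: {sorted(unknown)}")
    return {**BASE_PAYLOAD, **overrides}


def payload_grid(variables, countries):
    """Yield one payload per (variable, country) combination."""
    for variable, country in product(variables, countries):
        yield build_payload(variable=variable, country=country)


def download_csv(session, payload, out_path):
    """Post one payload and save the CSV that the reply points to.

    `session` is expected to be a requests.Session that has already
    visited the download page (for the cookies), as in the code above.
    """
    reply = session.post(POST_URL, data=payload)
    reply.raise_for_status()
    data = session.get(reply.json()["success"])
    data.raise_for_status()
    with open(out_path, "wb") as f:
        f.write(data.content)
    return out_path


# "ITA" is an assumed value, not one observed in the form's requests.
payloads = list(payload_grid(["tas"], ["DZA", "ITA"]))
print([(p["variable"], p["country"]) for p in payloads])
```

With the session s from the code above, each payload p could then be saved with something like download_csv(s, p, f"{p['variable']}_{p['country']}.csv").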

Requests docs: https://requests.readthedocs.io/en/latest/

And also pandas: https://pandas.pydata.org/pandas-docs/stable/index.html
