Selenium loop, table missing first row and last column

Time:11-08

I am trying to scrape the table at the URL below, but the result is missing the first row (2021-11) and the last column (净投放(亿)). What have I done wrong?

from selenium import webdriver

driver = webdriver.Firefox(executable_path = '/usr/local/bin/geckodriver')
driver.get('http://www.chinamoney.com.cn/chinese/hb/')

rws = driver.find_elements_by_xpath("//table/tbody/tr")
r = len(rws)

cols = driver.find_elements_by_xpath("//thead/tr/td")
c = len(cols)

element = []
row = []
for i in range(1,r):
    for j in range(1,c):
        d = driver.find_element_by_xpath("//tr[" + str(i) + "]/td[" + str(j) + "]").text
        row.append(d)

element.append(row)
driver.close()

element

CodePudding user response:

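  1. XPath positions start at 1, so keep starting the ranges at 1, but run them to `r + 1` and `c + 1`: `range(1, c)` stops at `c - 1` and drops the last column. Also scope the XPath to `//table/tbody`; an unscoped `//tr[1]` matches the header row in `thead` first, which is why the 2021-11 row goes missing.
  2. You should also wait for the page to load fully before accessing the table's contents.
    I will add a simple delay here, but it's better to use explicit waits. Please see if the following code works better for you:
import time

from selenium import webdriver

driver = webdriver.Firefox(executable_path = '/usr/local/bin/geckodriver')
driver.get('http://www.chinamoney.com.cn/chinese/hb/')

time.sleep(5)  # crude fixed delay so the table has time to render
rws = driver.find_elements_by_xpath("//table/tbody/tr")
r = len(rws)

cols = driver.find_elements_by_xpath("//thead/tr/td")
c = len(cols)

element = []
for i in range(1, r + 1):  # XPath positions are 1-based and the end is inclusive
    row = []  # start a fresh list for each table row
    for j in range(1, c + 1):
        # Scope to tbody so the header row in thead is not counted
        d = driver.find_element_by_xpath("//table/tbody/tr[" + str(i) + "]/td[" + str(j) + "]").text
        row.append(d)
    element.append(row)

driver.close()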

element

CodePudding user response:

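The loops use `range(1, r)` and `range(1, c)`, which stop at `r - 1` and `c - 1`, so the last column is never read. In addition, the unscoped `//tr[1]` matches the header row in `thead` before the first body row, so the 2021-11 row is skipped.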

And the last two rows of the table are empty.
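
To see why a positional XPath such as `//tr[1]` can return the header instead of the first data row, here is a small offline sketch. It uses Python's built-in `xml.etree.ElementTree` as a stand-in for the browser's XPath engine, and it assumes the page has a single table with a one-row `thead`. The markup and the English header labels are hypothetical; only the two data rows are copied from the real table.

```python
import xml.etree.ElementTree as ET

# Hypothetical miniature of the page's table: one header row in <thead>,
# data rows in <tbody>.
HTML = """
<table>
  <thead><tr><td>date</td><td>injected</td><td>withdrawn</td><td>net</td></tr></thead>
  <tbody>
    <tr><td>2021-11</td><td>2200</td><td>12200</td><td>-10000</td></tr>
    <tr><td>2021-10</td><td>13900</td><td>12300</td><td>1600</td></tr>
  </tbody>
</table>
"""

root = ET.fromstring(HTML)


def first_cells(path):
    """Return the first-cell text of every row matched by `path`."""
    return [tr.find("td").text for tr in root.findall(path)]


# Unscoped: tr[1] means "first tr child of its parent", so the header row
# matches too, and it comes first in document order.
unscoped_first = first_cells(".//tr[1]")
# Unscoped tr[2]: thead has no second row, so only the second body row matches.
unscoped_second = first_cells(".//tr[2]")
# Scoped to tbody: position 1 is the first data row.
scoped_first = first_cells(".//tbody/tr[1]")

print(unscoped_first)
print(unscoped_second)
print(scoped_first)
```

Selenium's `find_element_by_xpath` returns the first match in document order, so in this layout the unscoped lookup yields the header for position 1 and the second data row for position 2, and the 2021-11 row is never returned.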

To get all the values from the table, try the following:

driver.get("http://www.chinamoney.com.cn/chinese/hb/")

rows = driver.find_elements_by_xpath("//table//tr")
element = []

for row in rows:
    columns = row.find_elements_by_xpath(".//td") # Put a dot in the xpath to get an element from an element.
    list_row = []
    for col in columns:
        list_row.append(col.text)
    element.append(list_row)
print(element)
[['日期', '投放量(亿)', '回笼量(亿)', '净投放(亿)'], ['2021-11', '2200', '12200', '-10000'], ['2021-10', '13900', '12300', '1600'], ['2021-09', '11800', '5900', '5900'], ['2021-08', '4200', '2600', '1600'], ['2021-07', '2600', '3200', '-600'], ['2021-06', '3100', '2100', '1000'], ['2021-05', '1900', '2000', '-100'], ['2021-04', '2200', '2100', '100'], ['2021-03', '2400', '2500', '-100'], ['2021-02', '8300', '11440', '-3140'], ['2021-01', '10740', '12450', '-1710'], ['2020-12', '9150', '8900', '250'], ['2020-11', '17550', '17200', '350'], ['', '', '', ''], ['', '', '', '']]
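
If the two empty trailing rows get in the way, they can be dropped after scraping. The following is a minimal sketch, assuming the first non-empty row is always the header and every column except the date holds an integer; `non_empty` and `records` are names introduced here for illustration, and `element` is a shortened copy of the output above.

```python
# A shortened copy of the scraped result: header, two data rows, two blanks.
element = [
    ['日期', '投放量(亿)', '回笼量(亿)', '净投放(亿)'],
    ['2021-11', '2200', '12200', '-10000'],
    ['2021-10', '13900', '12300', '1600'],
    ['', '', '', ''],
    ['', '', '', ''],
]

# Keep only rows that have at least one non-blank cell.
non_empty = [row for row in element if any(cell.strip() for cell in row)]

# The first remaining row is the header; the rest are data.
header, data = non_empty[0], non_empty[1:]

# Turn each data row into a dict: the date stays a string,
# the remaining columns are converted to int.
records = [
    {header[0]: row[0],
     **{name: int(value) for name, value in zip(header[1:], row[1:])}}
    for row in data
]

print(records[0])
```

Keeping the header as dictionary keys means a value can be looked up by column name, for example `records[0]['净投放(亿)']`, rather than by position.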