I am trying to scrape the table at the URL below, but the result is missing the first row (2021-11) and the last column (净投放(亿), net injection in units of 100 million). What have I done wrong?
from selenium import webdriver

driver = webdriver.Firefox(executable_path = '/usr/local/bin/geckodriver')
driver.get('http://www.chinamoney.com.cn/chinese/hb/')
rws = driver.find_elements_by_xpath("//table/tbody/tr")
r = len(rws)
cols = driver.find_elements_by_xpath("//thead/tr/td")
c = len(cols)
element = []
row = []
for i in range(1,r):
    for j in range(1,c):
        d=driver.find_element_by_xpath("//tr[" + str(i) + "]/td[" + str(j) + "]").text
        row.append(d)
    element.append(row)
driver.close()
element
CodePudding user response:
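- You should loop over the full range, starting from 0 (range(r) and range(c)), not range(1, r) and range(1, c), which stop one position short. Because XPath positions are 1-based, add 1 to the loop index when building the XPath.
- Also, you should add a wait so the page is fully loaded before you access the table's contents.
I will add a simple delay here, but it's better to use explicit waits. Please see if the following code works better for you:
import time

from selenium import webdriver

driver = webdriver.Firefox(executable_path = '/usr/local/bin/geckodriver')
driver.get('http://www.chinamoney.com.cn/chinese/hb/')
time.sleep(5)  # crude fixed delay so the table has time to render
rws = driver.find_elements_by_xpath("//table/tbody/tr")
r = len(rws)
cols = driver.find_elements_by_xpath("//thead/tr/td")
c = len(cols)
element = []
for i in range(r):
    row = []  # start a fresh list for every table row
    for j in range(c):
        # XPath positions are 1-based, so shift the 0-based loop indices by one.
        # The path is scoped to tbody so the header row in thead is not matched.
        d = driver.find_element_by_xpath("//table/tbody/tr[" + str(i + 1) + "]/td[" + str(j + 1) + "]").text
        row.append(d)
    element.append(row)
driver.close()
element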
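The fixed delay can be swapped for an explicit wait. This is only a minimal sketch: it assumes the driver object from the code above and that the data rows sit under //table/tbody/tr. The names rows_loaded and wait_for_rows are hypothetical helpers, not part of Selenium.

```python
ROW_XPATH = "//table/tbody/tr"  # assumed location of the data rows


def rows_loaded(driver):
    """Wait condition: return the row elements once any exist, else False."""
    # "xpath" is the string value of selenium's By.XPATH locator strategy.
    rows = driver.find_elements("xpath", ROW_XPATH)
    return rows if rows else False


def wait_for_rows(driver, timeout=10):
    """Poll until the table rows are present or the timeout expires."""
    # Imported here so rows_loaded can be exercised without Selenium installed.
    from selenium.webdriver.support.ui import WebDriverWait

    # until() keeps calling rows_loaded(driver) and returns its first truthy result.
    return WebDriverWait(driver, timeout).until(rows_loaded)
```

Calling wait_for_rows(driver) right after driver.get(...) should return the row elements as soon as they appear, and raise a TimeoutException if none show up within the timeout.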
CodePudding user response:
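Since the for loops use range(1, r) and range(1, c), they stop one position short, so the last column is never read. In addition, //tr[i] is not limited to tbody, so position 1 probably resolves to the header row in thead, which would explain why the first data row is skipped.
Note also that the last two rows of the table are empty.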
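The off-by-one in the question's loops can be checked without a browser. A small sketch, assuming a table with 4 columns as in the question; column_count is a hypothetical stand-in for len(cols):

```python
# Hypothetical count, standing in for len(cols) from the question.
column_count = 4

# range(1, n) stops before n, so position 4 (the last column) is never requested.
visited_by_question = list(range(1, column_count))

# XPath positions are 1-based, so the full set of positions is 1..n inclusive.
all_positions = list(range(1, column_count + 1))

print(visited_by_question)  # [1, 2, 3]
print(all_positions)        # [1, 2, 3, 4]
```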
To get all the values from the table, try the following:
driver.get("http://www.chinamoney.com.cn/chinese/hb/")
rows = driver.find_elements_by_xpath("//table//tr")
element = []
for row in rows:
    columns = row.find_elements_by_xpath(".//td")  # The leading dot makes the XPath relative to this row element.
    list_row = []
    for col in columns:
        list_row.append(col.text)
    element.append(list_row)
print(element)
[['日期', '投放量(亿)', '回笼量(亿)', '净投放(亿)'], ['2021-11', '2200', '12200', '-10000'], ['2021-10', '13900', '12300', '1600'], ['2021-09', '11800', '5900', '5900'], ['2021-08', '4200', '2600', '1600'], ['2021-07', '2600', '3200', '-600'], ['2021-06', '3100', '2100', '1000'], ['2021-05', '1900', '2000', '-100'], ['2021-04', '2200', '2100', '100'], ['2021-03', '2400', '2500', '-100'], ['2021-02', '8300', '11440', '-3140'], ['2021-01', '10740', '12450', '-1710'], ['2020-12', '9150', '8900', '250'], ['2020-11', '17550', '17200', '350'], ['', '', '', ''], ['', '', '', '']]
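The two empty trailing rows can be dropped after scraping. A minimal sketch in plain Python, using a shortened copy of the output above as sample data; clean_table and to_records are hypothetical helper names:

```python
def clean_table(raw_rows):
    """Split scraped rows into a header and a list of non-empty data rows."""
    # Keep only rows that have at least one non-blank cell.
    non_empty = [row for row in raw_rows if any(cell.strip() for cell in row)]
    if not non_empty:
        return [], []
    header, *data = non_empty
    return header, data


def to_records(header, data):
    """Pair each data row with the header to get one dict per row."""
    return [dict(zip(header, row)) for row in data]


# Shortened sample: the header, two data rows, and the two empty rows.
sample = [
    ['日期', '投放量(亿)', '回笼量(亿)', '净投放(亿)'],
    ['2021-11', '2200', '12200', '-10000'],
    ['2021-10', '13900', '12300', '1600'],
    ['', '', '', ''],
    ['', '', '', ''],
]

header, data = clean_table(sample)
records = to_records(header, data)
print(len(data))                 # 2
print(records[0]['净投放(亿)'])  # -10000
```

The cell values stay as strings here; converting them to numbers is left out because the right numeric type depends on how the data will be used.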