My code returns HTML that is missing data. What could be causing this?
Everything worked fine until I changed the code to use Selenium expected conditions.
The code is not complete because the full version was not accepted here, but I think it shows what is happening.
navegador = webdriver.Firefox(options = options)
wait = WebDriverWait(navegador, 30)
link = '******'
navegador.get(url = link)
wait.until(EC.element_to_be_clickable((By.ID, "ctl00_ctl00_Content_Content_txtLogin"))).send_keys('******')
wait.until(EC.element_to_be_clickable((By.ID, "ctl00_ctl00_Content_Content_txtSenha"))).send_keys('******')
wait.until(EC.element_to_be_clickable((By.ID, "ctl00_ctl00_Content_Content_btnEnviar"))).click()
wait.until(EC.element_to_be_clickable((By.ID, "ctl00_ctl00_Content_Content_TreeView2t8"))).click()
wait.until(EC.element_to_be_clickable((By.CSS_SELECTOR, "a[title='07 de dezembro']"))).click()
wait.until(EC.element_to_be_clickable((By.ID, "ctl00_ctl00_Content_Content_ddlVagasTerminalEmpresa"))).click()
wait.until(EC.element_to_be_clickable((By.ID, "ctl00_ctl00_Content_Content_ddlVagasTerminalEmpresa"))).click()
wait.until(EC.visibility_of_element_located((By.XPATH, '//*[@id="ctl00_ctl00_Content_Content_ddlVagasTerminalEmpresa"]/option[2]'))).click()
teste = wait.until(EC.presence_of_element_located((By.XPATH, '//*[@id="divScroll"]'))).get_attribute('innerHTML')
soup = BeautifulSoup(teste, "html.parser")
This is what I get back:
<table align="center" style="border-right: #66cc00 1px solid; border-top: #66cc00 1px solid; border-left: #66cc00 1px solid; border-bottom: #66cc00 1px solid" width="100%">
<tbody><tr>
<td>
<table>
<tbody><tr>
<td >
<span id="ctl00_ctl00_Content_Content_Label1" style="font-size:12px;">Terminal - Empresa - Exportador:</span>
</td>
<td>
<select id="ctl00_ctl00_Content_Content_ddlVagasTerminalEmpresa" name="ctl00$ctl00$Content$Content$ddlVagasTerminalEmpresa" onchange="javascript:setTimeout('__doPostBack(\'ctl00$ctl00$Content$Content$ddlVagasTerminalEmpresa\',\'\')', 0)" style="width: 475px;">
<option selected="selected" value="0">Selecione um Terminal.</option>
<option value="68623">TEAG - CARGILL - 04 CARGILL AGRICOLA S A - GUARUJA - SP</option>
<option value="68594">TEG - CARGILL - 04 CARGILL AGRICOLA S A - GUARUJA - SP</option>
</select>
</td>
</tr>
</tbody></table>
</td>
</tr>
<tr>
<td >
<span id="ctl00_ctl00_Content_Content_lbl_titulo_principal" style="font-size:12px;">Disponibilização de vagas do dia: 07/12/2022</span></td>
</tr>
<tr>
<td></td>
</tr>
<tr>
<td valign="top">
</td>
</tr>
<tr>
This is what I should get back:
</tr>
<tr>
<td></td>
</tr>
<tr>
<td valign="top">
<div id="ctl00_ctl00_Content_Content_pn_turno_1" style="width:100%;">
<table width="100%" style="border-right: #66cc00 1px solid; border-top: #66cc00 1px solid; border-left: #66cc00 1px solid; border-bottom: #66cc00 1px solid">
<tbody><tr>
<td >
<span id="ctl00_ctl00_Content_Content_lbl_turno_1">Turno 01 - intervalo: 7/12/2022 0:00:00 as 7/12/2022 1:00:00</span></td>
</tr>
<tr>
<td style="height:200px;width: 100%;" valign="top">
<table border="0" cellpadding="4" cellspacing="2" style="font-size:14;width: 100%;z-index: -1;">
</table>
<table border="0" cellpadding="3" cellspacing="2" style="font-size:14;width: 100%">
<tbody><tr >
<td width="12%" align="center">
<span id="ctl00_ctl00_Content_Content_rpt_turno_1_ctl01_lblEmpresaTerminal_1" title="TEAG - CARGILL - 04 CARGILL AGRICOLA S A - GUARUJA - SP" style="font-size:7px;">CARGILL - TEAG</span>
<input type="image" name="ctl00$ctl00$Content$Content$rpt_turno_1$ctl01$imb_vaga_1" id="ctl00_ctl00_Content_Content_rpt_turno_1_ctl01_imb_vaga_1" title="Vaga agendada." src="../App_Themes/SisLog/Images/caminhao.png" onclick="javascript:window.open('Cadastro.aspx?id_agenda=7054462&id_turno=7/12/2022 0:00:00;7/12/2022 1:00:00&data=07/12/2022&id_turno_exportador=198574&id_turno_agenda=61348&id_transportadora=23213&id_turno_transp=68623&id_Cliente=7708&codigo_terminal=7708&codigo_empresa=1&codigo_exportador=24978&codigo_transportador=23213&codigo_turno=1&turno_transp_vg=68623','_blank','height=850,width=1000,top=(screen.width)?(screen.width-1000)/2 : 0,left=(screen.height)?(screen.height-700)/2 : 0,toolbar=no,location=no,directories=no,status=no,menubar=no,scrollbars=yes,resizable=no');" style="height:20px;border-width:0px;">
</td>
CodePudding user response:
Since you did not share a link to the page you are working on, we can only guess at the cause of your problem.
My guess is that you are reading the HTML from an element that is not fully rendered yet.
To fix this, try changing presence_of_element_located
to visibility_of_element_located
in this line:
teste = wait.until(EC.presence_of_element_located((By.XPATH, '//*[@id="divScroll"]'))).get_attribute('innerHTML')
so that it becomes:
teste = wait.until(EC.visibility_of_element_located((By.XPATH, '//*[@id="divScroll"]'))).get_attribute('innerHTML')
If that is not enough, try adding a short delay before reading the HTML, as follows:
import time  # needed for time.sleep
wait.until(EC.visibility_of_element_located((By.XPATH, '//*[@id="divScroll"]')))
time.sleep(2)
teste = navegador.find_element(By.XPATH, '//*[@id="divScroll"]').get_attribute('innerHTML')
If the element is not visible, so that visibility_of_element_located
cannot be applied to it, use presence_of_element_located
with a delay instead:
wait.until(EC.presence_of_element_located((By.XPATH, '//*[@id="divScroll"]')))
time.sleep(2)
teste = navegador.find_element(By.XPATH, '//*[@id="divScroll"]').get_attribute('innerHTML')
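A fixed time.sleep(2) can be too short or needlessly long. The select in the question has an onchange handler that calls __doPostBack, so the shift panels are probably filled in only after that postback finishes. As a rough sketch, assuming that is the cause, you could wait until the innerHTML actually contains a piece of the expected content. The helper name inner_html_contains is hypothetical, and the marker is taken from the id ctl00_ctl00_Content_Content_pn_turno_1 shown in the expected output; adjust both to your page.

```python
def inner_html_contains(locator, marker):
    """Build a condition callable for WebDriverWait.until() (hypothetical helper).

    The condition returns the element's innerHTML once it contains
    `marker`, and False until then, so until() keeps polling.
    """
    def condition(driver):
        # locator is a (strategy, value) tuple, e.g. (By.XPATH, "...")
        html = driver.find_element(*locator).get_attribute("innerHTML")
        return html if html and marker in html else False
    return condition


# Usage with the objects from the question (By.XPATH is the string "xpath"):
# teste = wait.until(
#     inner_html_contains(
#         ("xpath", '//*[@id="divScroll"]'),
#         "ctl00_ctl00_Content_Content_pn_turno_1",  # assumed marker
#     )
# )
# soup = BeautifulSoup(teste, "html.parser")
```

WebDriverWait.until() accepts any callable that takes the driver and returns a truthy value when the wait is over, so this works with the existing wait object and its 30 second timeout.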