How to get data from a combobox using Beautifulsoup and Python?

Time:10-08

I'm trying to get the selected text from a combobox. I was searching the text for the "selected" state, but in the source code there are two items with that state, and I only want the one that is actually selected.

I am using the following code to find it by the "span" tag and the id, but I get the error shown below:

login_request = session.post(url, data=payload, headers=headers, cookies=cookies)
sexo = BeautifulSoup(login_request.text, 'lxml').find("span", id="sexo-button").get_text(strip=True)
print(sexo)

Error: 'NoneType' object has no attribute 'get_text'

I also tried looking for it in the following way, which gives me None:

login_request = session.post(url, data=payload, headers=headers, cookies=cookies)
sexo = BeautifulSoup(login_request.text, 'lxml').select('select[name=sexo] option[selected]')[0].value
print(sexo)

HTML source:

<tr><td>Sexo</td><td colspan="3">
<div><select name="sexo" style="width: 140px; display: none;" id="sexo" tabindex="0" aria-disabled="false">
<option selected="" value="">[SELECCIONAR]</option>
<option value="M" selected="">Masculino</option>
<option value="F">Femenino</option>
</select>
<span><a id="sexo-button" role="button" href="#nogo" tabindex="0" aria-haspopup="true" aria-owns="sexo-menu" aria-disabled="false" style="width: 140px;"><span>Masculino</span><span></span></a></span>
<span>*</span></div>
</td></tr>

I just want to get the selected item, which in this case is "Masculino".

Answer:

From what I can see of the HTML, there is no span with id="sexo-button" (that id is on an a element), so BeautifulSoup(login_request.text, 'lxml').find("span", id="sexo-button") returns None, which is why calling get_text on the result raises the error.
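To make the None result concrete, here is a minimal sketch run against a trimmed copy of the HTML from the question. It uses the built-in "html.parser" instead of 'lxml' only so that it runs without extra dependencies, and it assumes the a element with id="sexo-button" is actually present in the response that session receives, which the question does not confirm.

```python
from bs4 import BeautifulSoup

# Trimmed copy of the markup from the question (attributes shortened).
html = (
    '<div><select name="sexo" id="sexo">'
    '<option selected="" value="">[SELECCIONAR]</option>'
    '<option value="M" selected="">Masculino</option>'
    '<option value="F">Femenino</option></select>'
    '<span><a id="sexo-button" role="button" href="#nogo">'
    '<span>Masculino</span><span></span></a></span><span>*</span></div>'
)

soup = BeautifulSoup(html, "html.parser")

# No <span> carries this id, so find() returns None.
span = soup.find("span", id="sexo-button")
print(span)

# Matching on the id alone finds the <a> element; guard against None
# before calling get_text so a missing element does not raise.
button = soup.find(id="sexo-button")
sexo = button.get_text(strip=True) if button is not None else None
print(sexo)
```

Checking for None before calling get_text avoids the AttributeError when the element is missing.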

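As for your second attempt, bs4 Tag objects have no value property: attribute-style access such as tag.value looks for a child tag named value and returns None when there is none, which is why you got None that time. To read the HTML attribute itself, use tag.get('value') or tag['value'].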
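The difference between attribute-style access and reading an HTML attribute can be seen in a small sketch. The one-line HTML string is a made-up fragment modelled on the option from the question, and "html.parser" is used only to keep the example dependency-free.

```python
from bs4 import BeautifulSoup

# Hypothetical one-option fragment modelled on the question's markup.
soup = BeautifulSoup(
    '<option value="M" selected="">Masculino</option>', "html.parser"
)
option = soup.select_one("option[selected]")

# Dotted access searches for a child tag called <value>; there is none.
child_lookup = option.value
print(child_lookup)

# The HTML attribute is read with .get() (or option["value"]).
attr_value = option.get("value")
print(attr_value)

# The visible label is the tag's text.
label = option.get_text(strip=True)
print(label)
```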

You should actually try combining the two like:

sexo = BeautifulSoup(login_request.text, 'lxml').select_one('select[name=sexo] option[selected]').get_text(strip=True)

(If it doesn't work, print what ...select_one('select[name=sexo]') and ...select_one('select[name=sexo] option[selected]') return individually, in case the page itself wasn't loaded properly by session.)
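One way to do that check is sketched below. The helper name inspect_select and the two sample HTML strings are made up for illustration; in your script you would pass login_request.text instead.

```python
from bs4 import BeautifulSoup


def inspect_select(html_text):
    """Print and return what each selector finds (hypothetical helper)."""
    soup = BeautifulSoup(html_text, "html.parser")
    select = soup.select_one("select[name=sexo]")
    selected = soup.select_one("select[name=sexo] option[selected]")
    print("select found:", select is not None)
    print("first selected option:", selected)
    return select, selected


# A response without the form (for example, a login page) finds nothing.
missing = inspect_select("<html><body>login required</body></html>")

# A response that does contain the form finds both elements.
found = inspect_select(
    '<select name="sexo"><option value="M" selected="">Masculino</option></select>'
)
```

If the first selector already comes back as None, the problem is with the request (login, cookies, or content rendered by JavaScript) rather than with the selector.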

You should also note that with the combined code, you'll actually get [SELECCIONAR], since that option also has the selected attribute and comes first in the HTML provided. To skip it, you can instead try:

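selectedOpts = BeautifulSoup(login_request.text, 'lxml').select('select[name=sexo] option[selected]')
# keep only options with a non-empty value, which drops the [SELECCIONAR] placeholder
selectedOpts = [s for s in selectedOpts if s.get('value')]
sexo = selectedOpts[0].get_text(strip=True) if selectedOpts else None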
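For reference, the filtering approach can be tried end to end on the HTML from the question. This is only a sketch: the function name selected_text is made up, the markup is a shortened copy of what was posted, and "html.parser" stands in for 'lxml' so that nothing beyond bs4 is required.

```python
from bs4 import BeautifulSoup

# Shortened copy of the <select> from the question.
HTML = """
<select name="sexo" id="sexo">
<option selected="" value="">[SELECCIONAR]</option>
<option value="M" selected="">Masculino</option>
<option value="F">Femenino</option>
</select>
"""


def selected_text(html_text, name):
    """Return the text of the first selected option with a non-empty value.

    Hypothetical helper; returns None when no such option exists.
    """
    soup = BeautifulSoup(html_text, "html.parser")
    options = soup.select(f'select[name="{name}"] option[selected]')
    # Drop placeholder options such as [SELECCIONAR], whose value is "".
    options = [o for o in options if o.get("value")]
    return options[0].get_text(strip=True) if options else None


sexo = selected_text(HTML, "sexo")
print(sexo)
```

With the markup above this prints Masculino; for a select that has no selected option with a value, the function returns None instead of raising.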