I am teaching myself to code and I am stuck on one line. Could you explain what is going wrong?
I want to scrape information from this div tag:
role = experience1_div('span', {'class' : 'mr1 t-bold'})
print(role)
Output:
[<span > <span aria-hidden="true"><!-- -->Automation Engineer - Intern<!-- --></span><span ><!-- -->Automation Engineer - Intern<!-- --></span> </span>]
How can I get only the text "Automation Engineer - Intern"?
I tried .get_text().strip(), but it seems that the nested span tag is blocking it.
CodePudding user response:
I don't know what experience1_div is, but to get all of the text, select a single element with find() and use role.text:
role = experience1_div.find('span', {'class' : 'mr1 t-bold'})
print(role.text)
Output:
Automation Engineer - InternAutomation Engineer - Intern
To get the text from the first nested span only, use
role.span.text
or, for the second nested span (the index is 2 because the whitespace before the first span counts as contents[0]),
role.contents[2].text
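Here is a minimal sketch of the three variants above, in case it helps to run them side by side. The HTML string is my own reconstruction of the output in the question (I assumed the outer span carries the classes mr1 and t-bold and sits inside a div), so treat it as an illustration and not as the real page.

```python
from bs4 import BeautifulSoup

# Assumed markup, rebuilt from the printed output in the question.
HTML = (
    '<div>'
    '<span class="mr1 t-bold"> '
    '<span aria-hidden="true"><!-- -->Automation Engineer - Intern<!-- --></span>'
    '<span><!-- -->Automation Engineer - Intern<!-- --></span>'
    ' </span>'
    '</div>'
)

experience1_div = BeautifulSoup(HTML, "html.parser").div

# find() returns a single Tag (or None), not a list.
role = experience1_div.find("span", {"class": "mr1 t-bold"})

# All text inside the outer span: both nested spans are concatenated.
all_text = role.get_text(strip=True)

# Only the first nested span.
first_text = role.span.get_text(strip=True)

# Only the second nested span, picked by tag instead of by contents index.
second_text = role.find_all("span")[1].get_text(strip=True)

print(all_text)
print(first_text)
print(second_text)
```

I used find_all("span")[1] for the second span because it does not depend on how many whitespace nodes surround the nested tags, unlike an index into contents.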
CodePudding user response:
The main issue is that your call returns a ResultSet (a list of tags), not a single tag. To get the text, you have to pick an element from it directly or iterate over it:
role[0].span.get_text(strip=True)
or
for e in role:
    print(e.span.get_text(strip=True))
Output:
Automation Engineer - Intern
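To make the ResultSet point concrete, here is a small sketch that reproduces the failure and then the fix. The HTML is an assumed reconstruction of the markup from the question, and the variable names roles, error and titles are only for illustration.

```python
from bs4 import BeautifulSoup

# Assumed markup, rebuilt from the printed output in the question.
HTML = (
    '<div>'
    '<span class="mr1 t-bold"> '
    '<span aria-hidden="true"><!-- -->Automation Engineer - Intern<!-- --></span>'
    '<span><!-- -->Automation Engineer - Intern<!-- --></span>'
    ' </span>'
    '</div>'
)

experience1_div = BeautifulSoup(HTML, "html.parser").div

# Calling a tag like a function is shorthand for find_all(),
# so this is a ResultSet (a list subclass), not a Tag.
roles = experience1_div("span", {"class": "mr1 t-bold"})

# A ResultSet has no get_text(); asking for it raises AttributeError.
try:
    roles.get_text()
    error = None
except AttributeError as exc:
    error = exc

# Iterate over the ResultSet and read the first nested span of each match.
titles = [tag.span.get_text(strip=True) for tag in roles]

print(type(roles).__name__, len(roles))
print(type(error).__name__)
print(titles)
```

So the nested span is not what blocks get_text(); the call simply has to be made on one of the tags inside the list.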
A better approach would be to select the element more specifically with a CSS selector (based on your example):
experience1_div.select_one('span.mr1.t-bold > span').get_text(strip=True)
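As a runnable sketch of the selector approach: select_one() returns the first match or None, so it is worth guarding against a missing element before calling get_text(). The HTML below is again my assumed reconstruction of the question's markup, and extract_title is a hypothetical helper name, not something from the original code.

```python
from bs4 import BeautifulSoup

# Assumed markup, rebuilt from the printed output in the question.
HTML = (
    '<div>'
    '<span class="mr1 t-bold"> '
    '<span aria-hidden="true"><!-- -->Automation Engineer - Intern<!-- --></span>'
    '<span><!-- -->Automation Engineer - Intern<!-- --></span>'
    ' </span>'
    '</div>'
)


def extract_title(container, selector="span.mr1.t-bold > span"):
    """Return the stripped text of the first match, or None if nothing matches."""
    node = container.select_one(selector)
    return node.get_text(strip=True) if node is not None else None


experience1_div = BeautifulSoup(HTML, "html.parser").div

title = extract_title(experience1_div)
missing = extract_title(experience1_div, "span.does-not-exist > span")

print(title)
print(missing)
```

The child combinator (>) limits the match to spans directly inside the outer span, and select_one() stops at the first of them, which is why the text comes back only once.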