How to scrape one of the spans inside another span with a class?

Time:11-20

<span class="sim-posted">
        
            <span class="jobs-status covid-icon clearfix">
                <i class="covid-home-icon"></i>Work from Home 
            </span>
            <span>Posted few days ago</span>
            
    </span>

I want to scrape the last span tag, the one with the text "Posted few days ago". I have the code below, but it only scrapes the first nested span, the one with a class:

date_published=job.find('span',class_='sim-posted').span.text

CodePudding user response:

Try this. It finds the span that has no class attribute inside the span you already reached:

date_published=job.find('span',class_='sim-posted').find("span", {"class": False}).text
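A minimal, self-contained check of this approach against the markup from the question. Here `job` is only a stand-in for whatever element the asker's loop yields (an assumption); in this sketch it is simply the parsed snippet.

```python
from bs4 import BeautifulSoup

# Markup copied from the question; `job` is a stand-in for the asker's element.
html = '''
<span class="sim-posted">
    <span class="jobs-status covid-icon clearfix">
        <i class="covid-home-icon"></i>Work from Home
    </span>
    <span>Posted few days ago</span>
</span>
'''
job = BeautifulSoup(html, "html.parser")
posted = job.find('span', class_='sim-posted')

# .span is shorthand for find('span'): it returns the first nested span.
first_text = posted.span.text.strip()

# {"class": False} matches only a span that has no class attribute.
date_published = posted.find('span', {'class': False}).text.strip()

print(first_text)
print(date_published)
```

This also shows why the original code returned the wrong element: `.span` always stops at the first nested span, which here is the one carrying the "Work from Home" label.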

CodePudding user response:

To scrape the last span tag, the one with the text "Posted few days ago", using Selenium, you can use any of the following locator strategies:

  • Using css with last-child:

    span.sim-posted span:last-child
    
  • Using css with last-of-type:

    span.sim-posted span:last-of-type
    
  • Using css with nth-child():

    span.sim-posted span:nth-child(2)
    
  • Using css with nth-of-type():

    span.sim-posted span:nth-of-type(2)
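In Selenium these strings would typically be passed to `driver.find_element(By.CSS_SELECTOR, ...)`, which needs a live browser. As a substitute, the sketch below checks the same four selectors offline with Beautiful Soup's `select`. It is not what the answer ran, only a quick way to confirm that each selector targets the intended span in the question's markup.

```python
from bs4 import BeautifulSoup

# Markup copied from the question.
html = '''
<span class="sim-posted">
    <span class="jobs-status covid-icon clearfix">
        <i class="covid-home-icon"></i>Work from Home
    </span>
    <span>Posted few days ago</span>
</span>
'''
soup = BeautifulSoup(html, "html.parser")

selectors = [
    "span.sim-posted span:last-child",
    "span.sim-posted span:last-of-type",
    "span.sim-posted span:nth-child(2)",
    "span.sim-posted span:nth-of-type(2)",
]

# Map each selector to the stripped text of every element it matches.
results = {
    css: [el.get_text(strip=True) for el in soup.select(css)]
    for css in selectors
}

for css, texts in results.items():
    print(css, "->", texts)
```

All four are positional, so they depend on the date span staying in the same place; if the site adds or reorders nested spans, the nth-child and nth-of-type variants are the first to break.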
    

CodePudding user response:

If it is always the last <span>, you can use the CSS selector last-of-type:

soup.select_one('span.sim-posted span:last-of-type').text

Example

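from bs4 import BeautifulSoup

# Same markup as in the question, including the class attributes
html = '''
<span class="sim-posted">
    <span class="jobs-status covid-icon clearfix">
        <i class="covid-home-icon"></i>Work from Home
    </span>
    <span>Posted few days ago</span>
</span>
'''
soup = BeautifulSoup(html, "html.parser")

# Select the last <span> among its siblings inside span.sim-posted
print(soup.select_one('span.sim-posted span:last-of-type').text)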

Output

Posted few days ago

Alternative

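You can also use :-soup-contains, a CSS pseudo-class selector that targets a node's text. It relies on Soup Sieve, which has provided Beautiful Soup's CSS selector support since Beautiful Soup 4.7.0. The :-soup-contains spelling needs Soup Sieve 2.1 or newer; older releases call it :contains.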

soup.select_one('span.sim-posted span:-soup-contains("Posted")').text
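A small sketch of the text-based selector with a guard for listings that have no date span, since `select_one` returns None when nothing matches. The helper name `posted_text` and the shortened markup are hypothetical, and the code assumes Soup Sieve 2.1 or newer is installed.

```python
from bs4 import BeautifulSoup


def posted_text(soup):
    """Hypothetical helper: return the 'Posted ...' text, or None if absent."""
    node = soup.select_one('span.sim-posted span:-soup-contains("Posted")')
    return node.get_text(strip=True) if node is not None else None


# Shortened versions of the question's markup, with and without the date span.
with_date = BeautifulSoup(
    '<span class="sim-posted">'
    '<span class="jobs-status">Work from Home</span>'
    '<span>Posted few days ago</span>'
    '</span>',
    "html.parser",
)
without_date = BeautifulSoup(
    '<span class="sim-posted">'
    '<span class="jobs-status">Work from Home</span>'
    '</span>',
    "html.parser",
)

print(posted_text(with_date))
print(posted_text(without_date))
```

Matching on text does not depend on the span's position, but it does depend on the wording: it stops matching if the site changes or translates "Posted".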