Web scraping selector for specific element

Time:12-16

I am new to web scraping and I'm trying to use Scrapy to scrape the Release Date for the following website: https://m.imdb.com/title/tt0468569/?ref_=adv_li_tt

This is the selector I am using:

//a[contains(@class,'ipc-metadata-list-item__list-content-item ipc-metadata-list-item__list-content-item--link')]/text()

It returns too many elements, and I just want the release date string.

CodePudding user response:

To be more specific and get only the release date text, adjust your XPath like this:

//li[contains(@data-testid,'title-details-releasedate')]//li/a/text()

This selects the <li> whose data-testid attribute contains title-details-releasedate. Because that element contains two <a> elements, the path targets only the <a> that is nested inside another <li>.
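
To see why the narrower path returns a single string, here is a minimal sketch that runs against a hand-written fragment rather than the live page. The fragment's structure and the placeholder date text are assumptions made for illustration; the real IMDb markup may differ. It uses Python's standard-library ElementTree, which supports only a subset of XPath, so an exact attribute match stands in for contains() and .text stands in for text().

```python
import xml.etree.ElementTree as ET

# Hypothetical fragment imitating the structure described above:
# an outer <li> with a label link, plus an inner <li> holding the value link.
# The date text is a placeholder, not data taken from the site.
SAMPLE = """
<ul>
  <li data-testid="title-details-releasedate">
    <a href="/releaseinfo">Release date</a>
    <div>
      <ul>
        <li><a href="/releaseinfo">January 1, 2000 (Exampleland)</a></li>
      </ul>
    </div>
  </li>
  <li data-testid="title-details-origin">
    <a href="/origin">Country of origin</a>
    <div>
      <ul>
        <li><a href="/country">Exampleland</a></li>
      </ul>
    </div>
  </li>
</ul>
"""

root = ET.fromstring(SAMPLE)

# Broad query: every link in the fragment, which is the "too many elements" problem.
broad = [a.text for a in root.findall(".//a")]

# Narrow query: anchor on the data-testid attribute, then keep only the <a>
# that sits directly inside a nested <li>, skipping the label link.
narrow = [
    a.text
    for a in root.findall(".//li[@data-testid='title-details-releasedate']//li/a")
]

print(broad)
print(narrow)
```

In a Scrapy callback, the XPath from the answer can be passed unchanged to response.xpath(...).get(), which returns the first matching string, or None if nothing matches.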
