Not able to find the XPath

I am trying to scrape the IMDb Top 250 movies with Scrapy and am stuck finding the XPath for each movie's duration (I need to extract "2", "h", "44" and "m"). Website link: https://www.imdb.com/title/tt15097216/?ref_=adv_li_tt

Here's the image of the HTML:

I've tried this XPath, but it's not accurate:

//li[@class ='ipc-inline-list__item']/following::li/text()

CodePudding user response：

If it's always in the same position, what about:

//li[@class ='ipc-inline-list__item']/following::li[2]

or more simply:

//li[@class ='ipc-inline-list__item'][3]

or, since the other items have hyperlinks as children, filter to just the li that has text() child nodes:

//li[@class ='ipc-inline-list__item'][text()]

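However, the original XPath may be fine; the problem may be how you are consuming the result. In Scrapy, .get() returns only the first matching node, so if you are using .get(), try .getall() instead (note the lowercase spelling), which returns every match as a list.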
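
Since .getall() returns a list of separate text fragments, you will probably want to combine them afterwards. Here is a minimal sketch using only the standard library. The function name runtime_to_minutes is hypothetical, and the fragment list ["2", "h", " ", "44", "m"] is an assumption about what the selector returns for this page, so adjust it to what you actually see:

```python
import re


def runtime_to_minutes(fragments):
    """Join text fragments such as ["2", "h", " ", "44", "m"] into total minutes.

    Returns None when the joined text does not look like "<n>h <n>m".
    """
    # Assumption: the fragments concatenate to something like "2h 44m".
    text = "".join(fragments).strip()
    match = re.fullmatch(r"(?:(\d+)\s*h)?\s*(?:(\d+)\s*m)?", text)
    if not text or match is None:
        return None
    hours = int(match.group(1) or 0)
    minutes = int(match.group(2) or 0)
    return hours * 60 + minutes


# Hypothetical result of response.xpath("...").getall() for one movie
print(runtime_to_minutes(["2", "h", " ", "44", "m"]))
```

In a spider you would pass the result of .getall() straight into this helper. Returning None for unexpected input lets you skip or log movies whose runtime is missing or formatted differently.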

CodePudding user response：

You can use this XPath to locate the element:

//span[contains(@class,'Runtime')]

To extract the text you can use this:

//span[contains(@class,'Runtime')]/text()