I am trying to scrape the year from the HTML below (https://www.espncricinfo.com/series/indian-premier-league-2022-1298423/punjab-kings-vs-delhi-capitals-64th-match-1304110/full-scorecard). Because of the way the site is coded, I first have to identify the table cell that contains the word "Season" and then get the year from the cell next to it (2022 in this example).
I thought the following would do it, but it doesn't: there are no errors, just no results. I've not used the following-sibling approach before, so I'd be grateful if someone could point out where I've messed up.
l.add_xpath(
    'Season',
    "//td[contains(text(),'Season')]/following-sibling::td[1]/a/text()",
)
HTML:
<tr>
  <td>
    <span>Season</span>
  </td>
  <td>
    <span>
      <a href="https://www.espncricinfo.com/ci/engine/series/index.html?season2022">
        <span>2022</span>
      </a>
    </span>
  </td>
</tr>
CodePudding user response:
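The word "Season" sits inside a span, not directly inside the td, so text() on the td only returns the td's own text nodes (just whitespace here) and contains() never matches. The a is also wrapped in a span rather than being a direct child of the following td. Match on the span and step back up to its td instead. Try: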
//span[contains(text(),"Season")]/../following-sibling::td/span/a/span/text()
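
If you want to check the expressions outside the spider, here is a minimal sketch using lxml (the library Scrapy's selectors are built on). The snippet is the row from the question wrapped in a table element, which is my assumption so the parser keeps the tr/td structure; the variable names are made up for illustration. The third expression is an alternative I'd suggest, not something from the question: it compares the td's whole string value, so it doesn't depend on how many spans wrap the text.

```python
from lxml import html

# The row from the question, wrapped in <table> (an assumption, so the
# HTML parser keeps the tr/td structure intact).
SNIPPET = """<table>
<tr>
<td>
<span>Season</span>
</td>
<td>
<span>
<a href="https://www.espncricinfo.com/ci/engine/series/index.html?season2022">
<span>2022</span>
</a>
</span>
</td>
</tr>
</table>"""

tree = html.fromstring(SNIPPET)

# Original attempt: text() only looks at the td's own text nodes, which are
# whitespace here, so no td matches and the result is empty.
original = tree.xpath(
    "//td[contains(text(),'Season')]/following-sibling::td[1]/a/text()"
)

# Expression from the answer: match the span, go up to its td, then walk
# down the sibling td through span/a/span.
fixed = tree.xpath(
    '//span[contains(text(),"Season")]/../following-sibling::td/span/a/span/text()'
)

# Alternative (my suggestion): compare the td's full string value, and take
# the string value of the next td, ignoring the nesting inside it.
alternative = tree.xpath(
    "normalize-space(//td[normalize-space(.)='Season']/following-sibling::td[1])"
)

print(original, fixed, alternative)
```

The same expression strings can be passed to l.add_xpath('Season', ...) in the loader.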