I am trying to build an XPath condition for the restrict_xpaths argument of the LinkExtractor in a Scrapy CrawlSpider. It should take all links in the footer; if the footer does not exist, it should take all links in the body. If both exist, it should take only the links in the footer.
All I have right now is this:
restrict_xpaths = ["//footer","//head"]
CodePudding user response:
Use a union whose second branch only matches when the page has no footer:
restrict_xpaths = ["//footer//a | //a[not(//footer)]"]
More generally:
narrow[global condition] | wider[not(global condition)]
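To check the expression outside Scrapy, here is a minimal sketch that assumes lxml is installed (Scrapy depends on it). The two sample pages and the helper name link_hrefs are made up for illustration; only the XPath string comes from the answer above.

```python
from lxml import html

# The XPath from the answer: footer links, or every link when no footer exists.
XPATH = "//footer//a | //a[not(//footer)]"


def link_hrefs(page_source):
    """Return the href of every <a> element selected by XPATH (hypothetical helper)."""
    tree = html.fromstring(page_source)
    return [a.get("href") for a in tree.xpath(XPATH)]


# Illustrative page that has both body links and a footer.
with_footer = """
<html><body>
  <a href="/body-link">body</a>
  <footer><a href="/footer-link">footer</a></footer>
</body></html>
"""

# Illustrative page with no footer at all.
without_footer = """
<html><body>
  <a href="/body-link">body</a>
  <a href="/other-link">other</a>
</body></html>
"""

print(link_hrefs(with_footer))     # only the footer link is selected
print(link_hrefs(without_footer))  # all links are selected
```

On the first page //footer matches, so not(//footer) is false and the second branch selects nothing. On the second page the first branch is empty and the second branch returns every link.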