I'm trying to get all the <div > elements that are following siblings of the <div > whose .//div/span[2] text is Pokémon. I want to stop as soon as I come across another <div >.
When the dynamic page looks like this:
<div >
<div >
<div >
<span ></span>
<span >Ashes Reborn</span>
</div>
</div>
<div >...</div>
<div >...</div>
<div >...</div>
<div >
<div >
<span ></span>
<span >Yu-Gi-Oh!</span>
</div>
</div>
<div >...</div>
<div >...</div>
<div >...</div>
<div >...</div>
<div >...</div>
<div >...</div>
<div >...</div>
<div >
<div >
<span ></span>
<span >Pokémon</span>
</div>
</div>
<div >...</div>
<div >...</div>
<div >...</div>
<div >...</div>
<div >...</div>
<div >...</div>
<div >...</div>
<div >...</div>
<div >...</div>
<div >...</div>
<div >...</div>
<div >...</div>
I get my data with the following XPath, which only works because nothing follows the Pokémon tournament:
//div[preceding-sibling::div[@ and .//div/span[2][contains(translate(., 'ABCDEFGHIJKLMNOPQRSTUVWXYZ', 'abcdefghijklmnopqrstuvwxyz'),'pok')]]]
It returns all the desired divs:
<div >...</div>
<div >...</div>
<div >...</div>
<div >...</div>
<div >...</div>
<div >...</div>
<div >...</div>
<div >...</div>
<div >...</div>
<div >...</div>
<div >...</div>
<div >...</div>
The problem is that the tournaments appear in random order, so I don't know whether another tournament will follow Pokémon's.
The webpage usually looks more like this:
<div >
<div >
<div >
<span ></span>
<span >Ashes Reborn</span>
</div>
</div>
<div >...</div>
<div >...</div>
<div >...</div>
<div >
<div >
<span ></span>
<span >Yu-Gi-Oh!</span>
</div>
</div>
<div >...</div>
<div >...</div>
<div >...</div>
<div >...</div>
<div >...</div>
<div >...</div>
<div >...</div>
<div >
<div >
<span ></span>
<span >Pokémon</span>
</div>
</div>
<div >...</div>
<div >...</div>
<div >...</div>
<div >...</div>
<div >...</div>
<div >...</div>
<div >...</div>
<div >...</div>
<div >...</div>
<div >...</div>
<div >...</div>
<div >...</div>
<div >
<div >
<span ></span>
<span >Magic: The Gathering</span>
</div>
</div>
<div >...</div>
<div >...</div>
<div >...</div>
<div >...</div>
<div >...</div>
<div >...</div>
<div >...</div>
<div >...</div>
<div >...</div>
<div >
<div >
<span ></span>
<span >Final Fantasy</span>
</div>
</div>
<div >...</div>
<div >...</div>
<div >...</div>
<div >...</div>
<div >...</div>
<div >...</div>
<div >
<div >
<span ></span>
<span >Keyforge</span>
</div>
</div>
<div >...</div>
<div >...</div>
</div>
Any help would be much appreciated!
CodePudding user response:
You can use this:
//div[not(@)][preceding-sibling::div[@][1][div/span[2][contains(.,'Pokémon')]]]
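The trick is the predicate [1]. Because preceding-sibling is a reverse axis, preceding-sibling::div[@][1] selects only the nearest preceding sibling div that matches [@]. A div is returned only if that nearest one contains your Pokémon string, so the selection stops as soon as another such div appears.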
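To see why the nearest-preceding-header rule stops at the next tournament, here is a minimal sketch of the same logic in plain Python. It is not the XPath itself: the standard library's ElementTree does not support the preceding-sibling axis, so the loop below tracks the most recent header by hand. The class names "group" and "row" and the row texts are made up for illustration, because the real attribute values are missing from the question.

```python
import xml.etree.ElementTree as ET

# Hypothetical markup: the class names "group" and "row" and the row texts
# are placeholders; the real attribute values are not shown in the question.
PAGE = """
<div>
  <div class="group"><div><span></span><span>Yu-Gi-Oh!</span></div></div>
  <div class="row">y1</div>
  <div class="row">y2</div>
  <div class="group"><div><span></span><span>Pokémon</span></div></div>
  <div class="row">p1</div>
  <div class="row">p2</div>
  <div class="row">p3</div>
  <div class="group"><div><span></span><span>Magic: The Gathering</span></div></div>
  <div class="row">m1</div>
</div>
""".strip()


def rows_under(container, title):
    """Return the row divs whose nearest preceding header contains `title`."""
    selected = []
    in_wanted_group = False
    for child in container:  # direct children, in document order
        if child.get("class") == "group":
            # A header div: read the text of its second span (div/span[2]).
            spans = child.findall("./div/span")
            name = (spans[1].text or "") if len(spans) > 1 else ""
            # Every header resets the flag, which is what the [1] predicate
            # achieves in the XPath: only the nearest header counts.
            in_wanted_group = title in name
        elif in_wanted_group:
            selected.append(child)
    return selected


container = ET.fromstring(PAGE)
pokemon_rows = [div.text for div in rows_under(container, "Pokémon")]
print(pokemon_rows)  # ['p1', 'p2', 'p3']
```

The Magic: The Gathering row is left out because its nearest header is no longer the Pokémon one, which is the behaviour the question asks for.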