How to get the XPath or CSS selector from a dynamically loaded website to follow links?

Time:10-22

This is a dynamically loaded website: https://www.gelbeseiten.de/suche/hotels/nürnberg. I'm trying to follow every link in the search results. I found the XPath //article[@class='mod mod-Treffer']/a for the result links, but it only matches a few of them. For the rest I can't find any selector, probably because those links are handled by JavaScript. I'm not familiar with this kind of dynamic website, so I don't know how to get a selector for it. Any suggestions would be highly appreciated.

CodePudding user response:

I will post this as an answer, without actually giving you the code, as it might help you more in the long term.

First, load that page in a browser with JavaScript disabled (you can disable JS directly in the browser settings, or use an extension such as uBlock Origin; look it up).

You will notice that only the first 2 hotels load fully; the rest are loaded dynamically by JavaScript (which in this case is disabled). The //article[@class='mod mod-Treffer']/a selector gets 13 hits, while there are more hotels than that on the page. However, each hotel is wrapped in an <article> tag, and that tag has a data-realid="[...]" attribute. The URL for each hotel is https://www.gelbeseiten.de/gsbiz/{data-realid}.

This is how you can get all those hotels' profile links.
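For anyone who wants a starting point, here is a minimal sketch of that idea using only the Python standard library. It assumes the page HTML has already been downloaded into a string. The sample markup and its ids are made up for illustration, and RealIdCollector and profile_urls are hypothetical helper names, not part of any library. Only the data-realid attribute and the /gsbiz/ URL pattern come from the description above.

```python
from html.parser import HTMLParser

# URL pattern described in the answer: /gsbiz/{data-realid}
BASE_URL = "https://www.gelbeseiten.de/gsbiz/"


class RealIdCollector(HTMLParser):
    """Collect the data-realid attribute of every <article> tag."""

    def __init__(self):
        super().__init__()
        self.real_ids = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser passes tag and attribute names in lower case.
        if tag != "article":
            return
        real_id = dict(attrs).get("data-realid")
        if real_id:  # skip articles without the attribute
            self.real_ids.append(real_id)


def profile_urls(html):
    """Return one profile URL per <article data-realid="..."> in the HTML."""
    collector = RealIdCollector()
    collector.feed(html)
    return [BASE_URL + real_id for real_id in collector.real_ids]


# Made-up sample markup (assumption): the real page will look different.
SAMPLE_HTML = """
<article class="mod mod-Treffer" data-realid="example-id-1">
  <a href="/gsbiz/example-id-1">Hotel A</a>
</article>
<article class="mod mod-Treffer" data-realid="example-id-2"></article>
<article class="mod mod-Other"></article>
"""

if __name__ == "__main__":
    for url in profile_urls(SAMPLE_HTML):
        print(url)
```

If you are using Scrapy, the same ids could probably be read with response.xpath("//article/@data-realid").getall() and the resulting URLs passed to response.follow, but check that against the live page before relying on it.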
