How to evaluate a relative XPath inside another XPath in Puppeteer?

Time:05-03

Here is my code, which gets the element handles of some target divs:

const puppeteer = require("puppeteer");

(async () => {
  const searchString = `https://www.google.com/maps/search/restaurants/@-6.4775265,112.057849,3.67z`;

  const browser = await puppeteer.launch();

  const page = await browser.newPage();
  await page.goto(searchString);

  const xpath_expression = '//div[contains(@aria-label, "Results for")]/div/div[./a]';

  await page.waitForXPath(xpath_expression);
  const targetDivs = await page.$x(xpath_expression);

  // const link_urls = await page.evaluate((...targetDivs) => {
  //   return targetDivs.map((e) => {
  //     return e.textContent;
  //   });
  // }, ...targetDivs);
})();

Inside each target div, the data I need is selected by these two relative XPath expressions:

'link' : './a/@href'
'title': './a/@aria-label'

Here is similar Python code that does what I want:

from parsel import Selector

response = Selector(page_content)

results = []

for el in response.xpath('//div[contains(@aria-label, "Results for")]/div/div[./a]'):
    results.append({
        'link': el.xpath('./a/@href').extract_first(''),
        'title': el.xpath('./a/@aria-label').extract_first('')
    })

How can I do this in Puppeteer?

CodePudding user response:

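I think you can get the href and ariaLabel property values like this:

   const targetDivs = await page.$x(xpath_expression);

   // for...of (rather than forEach with an async callback) awaits each div in order
   for (const [pos, div] of targetDivs.entries()) {
     // 'a[@href]' is evaluated relative to the div, so it matches its child links
     const links = await div.$x('a[@href]');
     if (links.length === 0) continue; // skip divs without a matching link
     const href = await (await links[0].getProperty('href')).jsonValue();
     const ariaLabel = await (await links[0].getProperty('ariaLabel')).jsonValue();
     console.log(pos, href, ariaLabel);
   }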

Note that these are the element properties, not the attribute values. For href, that might mean you get an absolute URL where the attribute holds a relative one, though I haven't checked whether it makes a difference for that particular page. I am not sure whether $x allows selecting attribute nodes or string values directly; the documentation only talks about element handles.
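
If the attribute values themselves are wanted, as in the parsel example from the question, one possible approach is to evaluate the relative XPath inside the page with the browser's own document.evaluate, using each element handle as the context node. The following is only a sketch under some assumptions: xpathString and scrapeResults are hypothetical helper names (not part of Puppeteer), and it relies on the handle's evaluate method passing the underlying element as the first argument of the page function. Requesting XPathResult.STRING_TYPE yields the string value of the first matching node, or an empty string when nothing matches, which is close to extract_first('').

```javascript
// Hypothetical helper: returns the string value of `expression`, evaluated
// relative to the element behind `handle` ('' when nothing matches).
async function xpathString(handle, expression) {
  return handle.evaluate(
    // This function runs in the page; `node` is the element behind the handle.
    (node, expr) =>
      document.evaluate(expr, node, null, XPathResult.STRING_TYPE, null)
        .stringValue,
    expression
  );
}

// Hypothetical helper: builds one { link, title } record per target div,
// mirroring the loop in the Python sample.
async function scrapeResults(targetDivs) {
  const results = [];
  for (const div of targetDivs) {
    results.push({
      link: await xpathString(div, './a/@href'),
      title: await xpathString(div, './a/@aria-label'),
    });
  }
  return results;
}

// Usage inside the question's async function, after page.$x(...):
//   const results = await scrapeResults(targetDivs);
//   console.log(results);
```

Because the XPath reads the attribute node, link should come back as written in the markup rather than as the resolved href property.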
