How to get the HTML element but exclude all child elements with querySelector

Time:09-15

I'm currently using document.querySelector with Puppeteer to retrieve the video links from a TikTok account's page, and I'm having trouble retrieving exactly what I need.

With this code:

const grabURLs = await page.evaluate(() => {
    const pgTag = document.querySelector('.tiktok-1qb12g8-DivThreeColumnContainer.eegew6e2 div div div div div')
    return pgTag.innerHTML;
})

console.log(grabURLs)

I receive not only the href that I need but also all of the child elements below it. How do I limit it so that the only innerHTML I receive is that of the first child?

Any help would be greatly appreciated, thank you!

CodePudding user response:

Instead of walking down a chain of divs, you can search the page directly for every link whose href points to a video.

Here's how to do it on TikTok:

// Matches every anchor whose href contains "/video/"
const videos = document.querySelectorAll("a[href*='/video/']");
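
Inside Puppeteer, DOM nodes do not serialize across page.evaluate, so the matches need to be reduced to plain strings before they are returned. A minimal sketch of that step, assuming the selector above still matches TikTok's markup; collectVideoLinks is a hypothetical helper name, not part of Puppeteer:

```javascript
// Hypothetical helper: takes a document-like object and returns the unique
// href values of all anchors whose href contains "/video/".
function collectVideoLinks(doc) {
  const anchors = doc.querySelectorAll("a[href*='/video/']");
  const hrefs = Array.from(anchors, (a) => a.href);
  return [...new Set(hrefs)]; // drop duplicates, keep first-seen order
}

// Possible use with Puppeteer (assumes `page` is an already opened Page):
// const links = await page.evaluate(
//   `(${collectVideoLinks.toString()})(document)`
// );
// console.log(links);
```

Puppeteer's page.$$eval with the same selector and a callback that maps each anchor to its href should give an equivalent list without the helper.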

CodePudding user response:

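You can use the firstElementChild property. (firstChild can return a whitespace text node, which has no innerHTML.)

So, the code becomes:

const grabURLs = await page.evaluate(() => {
    const pgTag = document.querySelector('.tiktok-1qb12g8-DivThreeColumnContainer.eegew6e2 div div div div div')
    // firstElementChild skips text nodes and returns the first child element
    return pgTag.firstElementChild.innerHTML;
})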

console.log(grabURLs)
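
If the goal is the element itself without any of its descendants, two further options are serializing a shallow clone or reading the href attribute directly. A minimal sketch, assuming the matched element is, or contains, the anchor you want; shallowHtml and firstHref are hypothetical helper names:

```javascript
// Hypothetical helper: serialize only an element's own tag and attributes.
// cloneNode(false) copies the node without its children.
function shallowHtml(el) {
  return el.cloneNode(false).outerHTML;
}

// Hypothetical helper: return the href attribute of the first anchor inside
// `root`, or null when no anchor with an href is found.
function firstHref(root) {
  const link = root.querySelector('a[href]');
  return link ? link.getAttribute('href') : null;
}
```

These helpers would need to be defined inside the page.evaluate callback, since functions declared in the Node script are not visible in the browser context, and then called on pgTag in place of pgTag.innerHTML.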