Cheerio: find tag with multiple specific criteria easily and elegantly?

I'm trying to scrape this page, https://liquipedia.net/dota2/Admiral, for all the <li> tags that are inside a <ul> tag, which in turn is inside a div with class mw-parser-output, that has the title property. (I think that is what they're called in the HTML world? Like <tag property="...">.)

What would be the most elegant, simple way to do this with Cheerio? I know I could do it with some for loops and so on, but if there were a simple way to do this, my code would be a lot cleaner.

CodePudding user response：

I'm sure glad Cheerio is like jQuery. A simple selector like this should do:

const li = $('div.mw-parser-output > ul > li[title]').toArray(); // Optionally turn the selected items into an array

Explanation of the CSS selector:

div.mw-parser-output Matches a div element that has the class mw-parser-output. The dot marks a class selector.
> Child combinator: matches only immediate children of the element on its left.
ul Matches a ul element.
li[title] Matches any li element, but only if it has a title attribute (with any value).

Then toArray() turns the result into a plain array of elements, so regular array methods can be used on it.
It's as simple as that.

You could also get an array of the text of each li element with the following:

const arrayOfLiTexts = li.map(el => $(el).text()); // toArray() yields plain DOM nodes, so wrap each one with $() before calling .text()

CodePudding user response：

https://developer.mozilla.org/en-US/docs/Web/CSS/Attribute_selectors

const elements = $('div[title].mw-parser-output ul li').toArray();