I'm trying to web scrape this page, https://liquipedia.net/dota2/Admiral, for all the <li> tags that are inside a <ul> tag that is in turn within a div with class mw-parser-output that has the title property. (I think that is what they're called in the HTML world? Like <tag property="...">.)
What would be the most elegant, simple way to do this with Cheerio? I know I could do this with some for loops and stuff, but if there was a simple way to do this, my code would be a lot cleaner.
CodePudding user response:
I'm sure glad Cheerio is like jQuery. A simple selector like this should do:
const li = $('div.mw-parser-output > ul > li[title]').toArray(); // Optionally turn the selected items into an array
Explanation of the CSS selector:
div.mw-parser-output: div makes sure the element is a <div>. The dot signifies that what follows is a class name.
>: Points to the immediate child.
ul: A plain <ul> tag.
li[title]: Any <li> tag, but it needs to have the title attribute.
Then we turn the result into an array so it becomes usable.
It's as simple as that.
You could also get an array of the text of each li element with the following:
const arrayOfLiTexts = li.map(el => $(el).text()); // toArray() returns plain DOM nodes, so wrap each one in $() before calling .text()
CodePudding user response:
https://developer.mozilla.org/en-US/docs/Web/CSS/Attribute_selectors
const elements = $('div[title].mw-parser-output ul li').toArray();