I am trying to write a script with Node and Cheerio that checks a website and returns true if tickets are available for a certain team's game. The basic HTML structure of the site is:
<body>
  <div >
    <div class="match">
      <div >
        <h3>
          <span class="team">Team 1</span>
          <span class="team">Team 2</span>
        </h3>
      </div>
      <div class="buy-info">
        <div >Buy</div>
      </div>
    </div>
    <div class="match">
      <div >
        <h3>
          <span class="team">Team 3</span>
          <span class="team">Team 4</span>
        </h3>
      </div>
      <div class="buy-info">
        <div >Buy</div>
      </div>
    </div>
  </div>
</body>
I want to use Cheerio's each() function to check each element with the match class and return true if 'Team 1' is the text of a team span and the corresponding Buy button has the class available. I have written the code below so far, but on my real-life example it returns the total number of available buttons multiplied by the number of games 'Team 1' plays.
const teamFinder = ".team";
// set match day class
const match = ".match";
let MatchCount = 0;
$(match).each((MDIdx, MDElm) => {
  $(teamFinder).each((parentIdx, parentElm) => {
    let team = $(parentElm).text();
    if (team == "Team 1") {
      if ($(buttonDiv).children().hasClass("available")) {
        MatchCount++;
      }
    }
  });
});
CodePudding user response:
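I think the issue is that your inner each() iterates over every .team element in the whole document, not just the ones inside the current .match div. The $(buttonDiv) lookup is unscoped in the same way.
So, instead of:
$(teamFinder).each()
you should pass the current match element as the context:
$(teamFinder, MDElm).each()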
Here is the fixed example: https://scrapeninja.net/cheerio-sandbox?slug=b9db2202ea00e21f7d7197b1667802edd411ea33
A "cleaner" approach is to map each match to a plain object first and then filter:
// define a function that accepts the HTML body and the cheerio module as args
function extract(input, cheerio) {
  // load the HTML into cheerio
  let $ = cheerio.load(input);
  // extract all matches as plain objects
  let matches = $('.match').map((i, matchElm) => {
    return {
      // team names inside this match only
      teams: $('.team', matchElm).map((j, teamElm) => $(teamElm).text().trim()).toArray(),
      // number of available buy buttons inside this match
      available: $('.buy-info .available', matchElm).length
    };
  }).toArray();
  // plain array of match objects
  console.log(matches);
  // count the matches where 'Team 1' plays and tickets are available
  let matchCount = matches.filter(m => m.teams.includes('Team 1') && m.available).length;
  console.log(`matchCount: ${matchCount}`);
}
https://scrapeninja.net/cheerio-sandbox?slug=2b6fd73137469289934c42294bba18fa1cbfa355
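Since the question asks for a true/false result rather than a count, one possible follow-up is to reduce the matches array to a boolean. This is only a sketch: hasTickets is a hypothetical helper name, and the sample data is invented to mirror the { teams, available } shape produced by the map() call above.

```javascript
// Hypothetical helper: returns true if `teamName` plays in at least one
// match that has one or more available buy buttons.
function hasTickets(matches, teamName) {
  return matches.some((m) => m.teams.includes(teamName) && m.available > 0);
}

// Sample data for illustration only, in the shape built by extract().
const sampleMatches = [
  { teams: ["Team 1", "Team 2"], available: 0 },
  { teams: ["Team 3", "Team 4"], available: 1 },
];

console.log(hasTickets(sampleMatches, "Team 1"));
console.log(hasTickets(sampleMatches, "Team 3"));
```

Using some() stops at the first qualifying match, which is enough when only a yes/no answer is needed.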