Node.js: Use Cheerio to scrape a YouTube video

Time:10-15

I am trying to build a system with Node.js that updates a video's title with its number of views every 60 seconds. For now I'm only trying to get the number of views; I'll do the rest afterwards. I'm having trouble with the Cheerio API. I request the page and load the response body, which contains the page source, like this:

console.log("Tool started!")


//TODO: get views number with scraping

const urlV = 'XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX';
const axios = require("axios").default;
const cheerio = require('cheerio');
const request = require('request');
request({
    method: 'GET',
    url: urlV
}, (err, res, body) => {
    let $ = cheerio.load(body);
    let views = $('.view-count style-scope ytd-video-view-count-renderer');
    console.log(views.text());
});

The tag that contains the number of views has the classes view-count, style-scope, and ytd-video-view-count-renderer.

The problem is that this code returns nothing:

let $ = cheerio.load(body);
let views = $('.view-count style-scope ytd-video-view-count-renderer');
console.log(views.text());

The console output is just two blank lines.

How could I then extract the number of views?

Answer:

Please don't crawl the DOM to get the view count. You can use the YouTube Data API to fetch the video's statistics and simply parse the JSON it returns.

I've tested the following, which works:

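const request = require('request');

// Query the videos endpoint for the statistics of a single video.
const options = {
  method: 'GET',
  json: true, // parse the response body as JSON
  url: 'https://www.googleapis.com/youtube/v3/videos',
  headers: {
    'Referer': 'YOUR DOMAIN URL'
  },
  qs: {
    part: 'statistics',
    id: 'dQw4w9WgXcQ', // the video ID
    key: 'YOUR API KEY'
  }
};
request(options, function (err, res, body) {
  if (err) {
    console.error(err);
    return;
  }
  console.log(body.items[0].statistics.viewCount);
});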

You can get an API key for the YouTube API by following Google's setup instructions. Set the Referer header to a domain you control, e.g. http://localhost if you're running this locally. Replace YOUR DOMAIN URL with that domain and YOUR API KEY with the API key Google gives you.
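Since the request package has been deprecated since 2020, here is a rough sketch of the same lookup using the fetch built into Node 18 and later, plus a small helper for the "every 60 seconds" part of the question. The function names (buildStatsUrl, extractViewCount, fetchViewCount, startViewCountPolling) are made up for illustration, and the sketch leaves out the Referer header on the assumption that the API key is not restricted by HTTP referrer. Treat it as a starting point rather than a finished implementation.

```javascript
// Build the videos endpoint URL with the same query parameters as above.
function buildStatsUrl(videoId, apiKey) {
  const url = new URL('https://www.googleapis.com/youtube/v3/videos');
  url.search = new URLSearchParams({
    part: 'statistics',
    id: videoId,
    key: apiKey
  }).toString();
  return url.toString();
}

// Pull the view count out of the parsed JSON body.
// The API returns viewCount as a string, so convert it to a number.
// Returns null when the video or the field is missing.
function extractViewCount(body) {
  const item = body && Array.isArray(body.items) ? body.items[0] : undefined;
  if (!item || !item.statistics || item.statistics.viewCount === undefined) {
    return null;
  }
  return Number(item.statistics.viewCount);
}

// Fetch the current view count (assumes Node 18+ for the global fetch).
async function fetchViewCount(videoId, apiKey) {
  const res = await fetch(buildStatsUrl(videoId, apiKey));
  if (!res.ok) {
    throw new Error(`YouTube API responded with HTTP ${res.status}`);
  }
  return extractViewCount(await res.json());
}

// Call getCount right away and then every intervalMs milliseconds,
// invoking onUpdate only when the value changes.
// Returns a function that stops the polling.
function startViewCountPolling(getCount, onUpdate, intervalMs = 60000) {
  let lastCount = null;
  async function tick() {
    try {
      const count = await getCount();
      if (count !== lastCount) {
        lastCount = count;
        onUpdate(count);
      }
    } catch (err) {
      console.error('View count lookup failed:', err.message);
    }
  }
  tick();
  const timer = setInterval(tick, intervalMs);
  return () => clearInterval(timer);
}

// Usage (needs a real API key and network access):
// const stop = startViewCountPolling(
//   () => fetchViewCount('dQw4w9WgXcQ', 'YOUR API KEY'),
//   (count) => console.log('views:', count)
// );
```

Keeping the URL building and the JSON parsing in separate functions means both can be checked without a network call. Polling once a minute also uses up API quota, so a longer interval may be worth considering.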
