I am working on a scraping project in which I scrape the Google Videos page, but I got stuck when I learned that I need to use ytimg to get the thumbnails of results that link to YouTube videos.
I am using cheerio for parsing. Here is my code:
$('.G6SP0b').each((i, el) => {
  // src holds the base64-encoded image
  img[i] = $(el).find('.h1hFNe').attr('src');
});
This code extracts every thumbnail from the Google Videos page, but they come in GIF/base64 format. For each result that links to a YouTube video, I want to store the thumbnail in the image array as a JPG URL instead.
Please correct my code, or suggest another way of converting.
CodePudding user response:
I ended up using this method: I scraped the URL of each YouTube video and used its video ID to build the thumbnail URL:
http://img.youtube.com/vi/videoID/mqdefault.jpg
And here is my code:
// Start with the base64 thumbnails as a fallback.
$('.G6SP0b').each((i, el) => {
  image[i] = $(el).find('.h1hFNe').attr('src');
});
$('.egMi0').each((i, el) => {
  link[i] = $(el).find('a').attr('href');
  // Decode the first percent-encoded "?" and "=" in the target URL.
  link[i] = link[i].replace("%3F", "?");
  link[i] = link[i].replace("%3D", "=");
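  // Drop the first 7 characters (the redirect prefix) and everything from the first "&" onward.
  link[i] = link[i].substring(7, link[i].indexOf("&"));
  if (link[i].includes("www.youtube.com")) {
    // Everything after "https://www.youtube.com/watch?v=" (32 characters) is used as the video ID.
    image[i] = `http://img.youtube.com/vi/${link[i].substring(32)}/mqdefault.jpg`;
  }
});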
And these are the results I get:
"videoResults": [
{
"thumbnail": "http://img.youtube.com/vi/ET0G1FYxWqc/mqdefault.jpg",
"link": "https://www.youtube.com/watch?v=ET0G1FYxWqc"
},
{
"thumbnail": "http://img.youtube.com/vi/-QXrYIHODzE&vl=en-US/mqdefault.jpg",
"link": "https://www.youtube.com/watch?v=-QXrYIHODzE&vl=en-US"
},
{
"thumbnail": "",
"link": "https://www.espn.com/nfl/story/_/id/33822691/nfl-draft-2022-national-title-super-bowl-winning-qb-record-draft-night-part-impressive-year-georgia-bulldogs"
},
]
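Note that in the second result above, the extra query parameter (&vl=en-US) ends up inside the thumbnail URL, because everything after the first 32 characters of the link is treated as the video ID. A possible way around this is to parse the link with the built-in URL class and read only the v parameter. The following is a minimal sketch, not a definitive implementation: unwrapGoogleLink and youtubeThumbnail are hypothetical helper names, the "/url?q=...&sa=U" shape of the Google href is an assumption, and the sample href is made up for illustration.

```javascript
// Hypothetical helpers; they are not part of cheerio or of any YouTube API.

// Unwrap a Google redirect href into its target URL.
// Assumption: hrefs look like "/url?q=<percent-encoded target>&sa=...".
function unwrapGoogleLink(href) {
  if (!href) return "";
  // The href is relative, so a base URL is needed to parse it.
  const url = new URL(href, "https://www.google.com");
  // searchParams.get() returns the value already percent-decoded.
  return url.pathname === "/url" ? url.searchParams.get("q") || "" : href;
}

const YOUTUBE_HOSTS = new Set(["www.youtube.com", "youtube.com", "m.youtube.com"]);

// Build the mqdefault thumbnail URL for a YouTube link, or "" for anything else.
function youtubeThumbnail(link) {
  let url;
  try {
    url = new URL(link);
  } catch {
    return ""; // not an absolute URL
  }
  let id = null;
  if (YOUTUBE_HOSTS.has(url.hostname)) {
    // Only the "v" parameter is the video ID; "vl" and others are ignored.
    id = url.searchParams.get("v");
  } else if (url.hostname === "youtu.be") {
    // Short links carry the ID in the path: https://youtu.be/<id>
    id = url.pathname.slice(1);
  }
  return id ? `http://img.youtube.com/vi/${id}/mqdefault.jpg` : "";
}

// Illustrative href only; real hrefs carry more tracking parameters.
const href = "/url?q=https://www.youtube.com/watch%3Fv%3D-QXrYIHODzE%26vl%3Den-US&sa=U";
const link = unwrapGoogleLink(href);
console.log(link);
console.log(youtubeThumbnail(link));
```

Inside the .each() callback, the result of youtubeThumbnail could replace the base64 src whenever it is non-empty, so non-YouTube results keep their original thumbnail.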