I am using the link-preview-js NPM package to fetch a website's title, description and image. However, I am not receiving the correct data for a specific website.
Website: https://www.britishcouncil.pk/exam/school/your-world
Result I am getting:
{
  url: 'https://www.britishcouncil.pk/exam/school/your-world',
  title: 'Access Denied',
  siteName: undefined,
  description: undefined,
  mediaType: 'website',
  contentType: 'text/html',
  images: [],
  videos: [],
  favicons: [ 'https://www.britishcouncil.pk/favicon.ico' ]
}
I need to find a user-agent that fetches the correct data. I tried googlebot, Twitterbot and facebookexternalhit without success. How do I get the correct data?
The correct result would have "Your World - Your Opportunity | British Council" as the title.
Code:
exports.fetchLinkPreview = functions.https.onRequest(async (req, res) => {
  cors(req, res, async () => {
    try {
      const link = req.query.link;
      const { getLinkPreview } = require('link-preview-js');
      const linkResult = await getLinkPreview(link, {
        imagesPropertyType: "og",
        headers: {
          "user-agent": "googlebot"
        },
        timeout: 10000
      });
      return res.send({ error: false, message: linkResult });
    } catch (e) {
      console.log("Error", e);
      res.send({ error: true, message: "Incorrect link" });
    }
  });
});
CodePudding user response:
Here's a list of recent user-agent strings: https://whatmyuseragent.com/engines
The first one on that list worked for me: Mozilla/5.0 (Linux; Android 11; vivo 1904; wv) AppleWebKit/537.36 (KHTML, like Gecko) Version/4.0 Chrome/87.0.4280.141 Mobile Safari/537.36 VivoBrowser/8.7.0.1
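As a rough sketch of how that could be wired into the code from the question: the helper below tries a list of user-agent strings in order and stops at the first preview whose title is not the "Access Denied" block page. The names isBlocked and previewWithFallback are hypothetical, not part of link-preview-js, and the getLinkPreview function is passed in as an argument so the helper does not depend on how the package is loaded. Whether a given user-agent gets through depends on the site's bot protection, so treat the list as something to experiment with.

```javascript
// Browser-like user-agent string quoted in the answer above.
const BROWSER_UA =
  "Mozilla/5.0 (Linux; Android 11; vivo 1904; wv) AppleWebKit/537.36 " +
  "(KHTML, like Gecko) Version/4.0 Chrome/87.0.4280.141 Mobile Safari/537.36 " +
  "VivoBrowser/8.7.0.1";

// Hypothetical check: treat a missing title or the "Access Denied" title
// (the result shown in the question) as a blocked response.
function isBlocked(preview) {
  return !preview || !preview.title || preview.title === "Access Denied";
}

// Hypothetical helper: call getLinkPreview once per user-agent, in order,
// and return the first preview that does not look blocked. If every
// attempt looks blocked, the last result is returned unchanged.
async function previewWithFallback(getLinkPreview, link, userAgents) {
  let lastResult = null;
  for (const userAgent of userAgents) {
    lastResult = await getLinkPreview(link, {
      imagesPropertyType: "og",
      headers: { "user-agent": userAgent },
      timeout: 10000,
    });
    if (!isBlocked(lastResult)) {
      return lastResult;
    }
  }
  return lastResult;
}

// Usage inside the cloud function from the question (not executed here):
//   const { getLinkPreview } = require("link-preview-js");
//   const linkResult = await previewWithFallback(
//     getLinkPreview, link, [BROWSER_UA, "googlebot"]);
```

Putting the browser-like string first means the common case costs a single request; the bot user-agents are only tried if that one is rejected.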