As an exercise, I'm creating a simple API that lets users provide a search term to retrieve links to matching news articles across a collection of resources. The relevant function is as follows:
function GetArticles(searchTerm) {
    const articles = [];
    //Loop through each resource
    resources.forEach(async resource => {
        const result = await axios.get(resource.address);
        const html = result.data;
        //Use Cheerio: load the html document and create Cheerio selector/API
        const $ = cheerio.load(html);
        //Filter html to retrieve appropriate links
        $(`a:contains(${searchTerm})`, html).each((i, el) => {
            const title = $(el).text();
            let url = $(el).attr('href');
            articles.push(
                {
                    title: title,
                    url: url,
                    source: resource.name
                }
            );
        })
    })
    return articles; //Empty array is returned
}
And the route handler that uses the function:
app.get('/news/:searchTerm', async (req, res) => {
    const searchTerm = req.params.searchTerm;
    const articles = await GetArticles(searchTerm);
    res.json(articles);
})
The problem is that the returned "articles" array is empty. However, if I skip the loop over each resource (the one commented at the start of GetArticles) and run the main logic on just a single "resource", "articles" comes back populated with the requested data. In other words, if the function is the following:
async function GetArticles(searchTerm) {
    const articles = [];
    const result = await axios.get(resources[0].address);
    const html = result.data;
    const $ = cheerio.load(html);
    $(`a:contains(${searchTerm})`, html).each((i, el) => {
        const title = $(el).text();
        let url = $(el).attr('href');
        articles.push(
            {
                title: title,
                url: url,
                source: resources[0].name
            }
        );
    })
    return articles; //Populated array
}
Then "articles" is not empty, as intended.
I'm sure this has to do with how I'm dealing with the asynchronous nature of the code. I've tried refreshing my knowledge of asynchronous programming in JS but I still can't quite fix the function. Clearly, the "articles" array is being returned before it's populated, but how?
Could someone please help explain why my GetArticles function works with a single "resource" but not when looping over an array of "resources"?
CodePudding user response:
Try this:
function GetArticles(searchTerm) {
    //Start every request at once; Promise.all resolves when all have finished
    return Promise.all(resources.map(resource => axios.get(resource.address)))
        //Promise.all keeps the input order, so responses[i] belongs to resources[i]
        .then(responses => responses.flatMap((result, i) => {
            const html = result.data;
            //Use Cheerio: load the html document and create Cheerio selector/API
            const $ = cheerio.load(html);
            const articles = [];
            //Filter html to retrieve appropriate links
            $(`a:contains(${searchTerm})`, html).each((_, el) => {
                const title = $(el).text();
                const url = $(el).attr('href');
                articles.push(
                    {
                        title: title,
                        url: url,
                        source: resources[i].name
                    }
                );
            })
            return articles;
        }));
}
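The problem in your implementation was here:
resources.forEach(async resource...
forEach calls your async callback once per resource, but it ignores the promise each call returns, so nothing waits for the requests to finish. GetArticles reaches "return articles" while every axios.get is still pending.
So the array is always empty at the moment it is returned. The callbacks only push into it later, after the route handler has already sent the response. Your single-resource version works because there the await sits directly inside an async function, so the function itself pauses until the request completes.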
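To see the difference in isolation, here is a minimal sketch that runs without axios or Cheerio. The names fakeGet, withForEach and withPromiseAll are hypothetical, and fakeGet is only a timer-based stand-in for axios.get; the resources array is made up to mirror the shape used in the question.

```javascript
// Hypothetical stand-in for axios.get: resolves after a short delay.
function fakeGet(address) {
  return new Promise(resolve =>
    setTimeout(() => resolve({ data: `page from ${address}` }), 10));
}

// Made-up resource list, shaped like the one in the question.
const resources = [
  { name: 'A', address: 'https://a.example' },
  { name: 'B', address: 'https://b.example' },
];

// Broken: forEach discards the promise each async callback returns,
// so this function returns before any push has happened.
function withForEach() {
  const articles = [];
  resources.forEach(async resource => {
    const result = await fakeGet(resource.address);
    articles.push({ source: resource.name, body: result.data });
  });
  return articles;
}

// Fixed: map collects the promises and Promise.all waits for all of them.
async function withPromiseAll() {
  const perResource = await Promise.all(resources.map(async resource => {
    const result = await fakeGet(resource.address);
    return [{ source: resource.name, body: result.data }];
  }));
  // One array per resource, flattened into a single list.
  return perResource.flat();
}

async function main() {
  const early = withForEach();
  console.log(early.length); // 0 at the moment of return

  const all = await withPromiseAll();
  console.log(all.map(a => a.source)); // [ 'A', 'B' ]
}

main();
```

If the requests should run one after another instead of in parallel, a plain for...of loop with await inside an async function would also wait correctly; Promise.all is usually the better fit when the requests are independent.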