I build a scraper where the scraped data gets compared with already existing data to avoid duplicates, create new entries and update old entries. I'm doing this with a for loop, which loops over a findOne function where are two awaits in. The problem is that my for loop is ignoring (because it's sync?) my awaits and goes over to a part, where it is important that all of these awaits are done.
async function comparedata(length) {
console.log("Starting comparing entries in data");
for (let x = 0; x < length; x ) {
const model = new dataModel({
link: dataLinks[x],
name: dataGames[x].replace('Download', ' '),
logo: dataLogos[x],
provider: 'data',
});
model.collection.findOne({ "link": dataLinks[x] }, async function (err, found) {
if (err) throw err;
if (found == null) {
await model.save().then((result) => {
console.log(result) // Is not happening because the for loop goes through to the next function and closes the server
}).catch((err) => { console.log(err) });
}
else if (found != null) {
if (dataGames[x] != found.name) {
await model.collection.findOneAndUpdate({ link: dataLinks[x] }, { $set: { name: dataGames[x] } });
}
}
})
}
closeServer()//Closes the server is happening before new entries or updates are made.
}
My idea was to work with promises, but even if I tried to, it was just getting resolved too fast and closes the server again.
CodePudding user response:
You should be able to simplify your logic like this:
async function comparedata(length) {
console.log('Starting comparing entries in data');
try {
for (let x = 0; x < length; x ) {
let found = await dataModel.findOne({ link: dataLinks[x] });
if (!found) {
found = await dataModel.create({
link: dataLinks[x],
name: dataGames[x].replace('Download', ' '),
logo: dataLogos[x],
provider: 'data',
});
} else if (found.name !== dataGames[x]) {
found.name = dataGames[x];
await found.save();
}
console.log(found);
}
} catch (e) {
console.log(e);
}
closeServer();
}
CodePudding user response:
For the first iteration of for loop, the findOne's callback is put in the callback queue by the event loop and it proceeds with the next iteration not waiting for the awaits, this goes till the last iteration and after the last iteration it immediately calls the closeServer(), after this closeServer() call the tasks put into the callback queue (i.e the findOne's callbacks) are considered and executed by the event loop. Inorder to get an understanding of this, you have to learn about the event loop and how the event loop executes the javascript code. Please check about the event loop working mechanism youtube video here
You can use the promise style execution of findOne and overcome this issue.
And according to me, using await & then() functionalities on the same statement is not a good practice.
My suggestion,
async function comparedata(length) {
console.log("Starting comparing entries in data");
for (let x = 0; x < length; x ) {
const model = new dataModel({
link: dataLinks[x],
name: dataGames[x].replace('Download', ' '),
logo: dataLogos[x],
provider: 'data',
});
// using .exec() at the end allows us to go with the promise way of dealing things rather than callbacks
// assuming that 'dataModel' is the Schema, so I directly called findOne on it
const found = await dataModel.findOne({ "link": dataLinks[x] }).exec();
if (found == null) {
// wrap the await in try...catch for catching errors while saving
try{
await model.save();
console.log("Document Saved Successfully !");
}catch(err) {
console.log(`ERROR while saving the document. DETAILS: ${model} & ERROR: ${err}`)
}
} else if (found != null) {
if (dataGames[x] != found.name) {
await model.collection.findOneAndUpdate({ link: dataLinks[x] }, { $set: { name: dataGames[x] } });
}
}
if (x > length)
sendErrorMail();
}
closeServer()//Closes the server is happening before new entries or updates are made.
}
NOTE: Please refer the Mongoose official documentation for updates.
CodePudding user response:
I'm not super familiar with mongoose api, but if you are unable to use a promisified version then you can use the Promise constructor to "jump" out of a callback:
const found = await new Promise((resolve, reject) => {
model.collection.findOne({ "link": dataLinks[x] }, function (err, found) {
if (err) {
reject(err);
return;
};
resolve(found);
}
});
This might help you work things out.