Is it a good idea to respond to the client while an asynchronous operation continues to run?

I need to implement an "expensive" API endpoint. Basically, the user/client needs to be able to create a "group" of existing users.

This "create group" API needs to check that each user fulfills the criteria, i.e. all users in the same group must be from the same region, of the same gender, within the same age group, etc. This operation can be quite expensive, especially since there is no limit on how many users a group can have, so it's possible that the client requests a group of 1000 users, for example.

My idea is that the endpoint just creates an entry in the database and marks the "group" as pending while the checking process is still running. Once the check has finished, it updates the group status to "completed", or to "error" with an error message. The client then needs to fetch the status periodically for as long as it's still pending.

My implementation idea is something along these lines:

const createGroup = async (req, res) => {
    const { ownerUserId, userIds } = req.body;

    // Create the group row with "pending" status; returns the primary key.
    const groupId = await insertGroup(ownerUserId, 'pending');

    // Expensive: checks each user over the network (e.g. 0.5 s per user id).
    // Deliberately not awaited, so it keeps running after the response is sent.
    checkUser(userIds)
        .then((isUserIdsValid) =>
            // Return the promise so a failed update reaches the catch below.
            updateGroup(groupId, isUserIdsValid ? 'completed' : 'error'))
        .catch((err) => {
            console.error(err);
            return updateGroup(groupId, 'error');
        })
        // Last resort: never leave a rejection unhandled in the background.
        .catch((err) => console.error('Could not record group status', err));

    // 202 Accepted: the request was received, but processing has not finished.
    // The client uses this groupId to check the status via a separate API.
    res.status(202).json({ groupId });
};

My question is: is it a good idea to do this? Am I missing something important that I should consider?

CodePudding user response:

Yes, this is the standard approach to long-running operations. Instead of offering a createGroup API that creates and returns a group, think of it as having an addGroupCreationJob API that creates and returns a job.
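To make the "job" framing concrete, here is a minimal, framework-free sketch. Everything in it is an assumption for illustration: the in-memory `jobs` map stands in for a database table, `checkUser` is a dummy replacement for the real network check, and the `/jobs/:jobId` URL and the returned `{ httpStatus, body }` shape are hypothetical, not part of any particular framework.

```javascript
// In-memory stand-in for a jobs table; a real service would use its database.
const jobs = new Map();
let nextJobId = 1;

// Dummy stand-in for the expensive per-user check from the question.
const checkUser = async (userIds) => userIds.every((id) => Number.isInteger(id));

// Creates the job record, starts the work without awaiting it, and returns
// what an HTTP handler could send back (202 Accepted plus a status URL).
const addGroupCreationJob = (ownerUserId, userIds) => {
    const jobId = String(nextJobId++);
    const job = { jobId, ownerUserId, status: 'pending', error: null };
    jobs.set(jobId, job);

    // Keep a handle on the promise so other code (and tests) can wait for it.
    job.done = checkUser(userIds)
        .then((ok) => {
            job.status = ok ? 'completed' : 'error';
            if (!ok) job.error = 'One or more users do not meet the group criteria';
        })
        .catch((err) => {
            job.status = 'error';
            job.error = String((err && err.message) || err);
        });

    return { httpStatus: 202, body: { jobId, statusUrl: `/jobs/${jobId}` } };
};

// What a GET /jobs/:jobId handler could return.
const getJobStatus = (jobId) => {
    const job = jobs.get(jobId);
    if (!job) return { httpStatus: 404, body: { error: 'Unknown job' } };
    return { httpStatus: 200, body: { jobId, status: job.status, error: job.error } };
};
```

The point of the design is that the client gets an identifier for the job, not for the group, so the group itself only needs to exist once the job has completed.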

Instead of polling (periodically fetching the status to check whether it's still pending), you can use a notification API (events via WebSocket, SSE, webhooks, etc.) and even let clients subscribe to the progress of the processing. But sure, a check-status API (a GET request on the job identifier) is the lowest common denominator that all kinds of clients will be able to use.
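On the client side, the polling variant could look roughly like the sketch below. `fetchStatus` is a hypothetical function supplied by the caller that performs the GET request and resolves to an object with a `status` field; the interval and attempt limit are arbitrary defaults, not recommendations.

```javascript
const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

// Calls fetchStatus() until the job leaves the "pending" state, waiting
// intervalMs between attempts and giving up after maxAttempts.
const pollUntilSettled = async (fetchStatus, { intervalMs = 1000, maxAttempts = 30 } = {}) => {
    for (let attempt = 1; attempt <= maxAttempts; attempt++) {
        const { status, error } = await fetchStatus();
        if (status !== 'pending') return { status, error, attempts: attempt };
        await sleep(intervalMs);
    }
    throw new Error(`Job still pending after ${maxAttempts} attempts`);
};
```

Capping the number of attempts matters here: if the server never moves the job out of "pending", an unbounded loop would poll forever.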

Did I not consider something important?

Failure handling gets much more complicated. Because you no longer create the group in a single transaction, your application can be left in an intermediate state, e.g. when the service crashes (for unrelated reasons) during the checkUser() call. You'll need something to ensure that there are no pending groups in your database for which no actual creation process is running. You'll need to give users the ability to retry a job: will insertGroup work if there already is a group with the same identifier in the error state? If you separate the groups and the jobs into independent entities, do you need to ensure that no two pending jobs are trying to create the same group? Last but not least, you might want to allow users to cancel a currently running job.
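One possible way to handle orphaned pending jobs and retries is sketched below. It assumes that each job record carries a `heartbeatAt` timestamp (in milliseconds) that the worker refreshes while it is running; that field, the function names, and the timeout are all hypothetical, and a real implementation would do these updates in the database, not on an in-memory `Map`.

```javascript
// Marks pending jobs whose worker has not reported in within staleAfterMs
// as failed, so clients stop polling a job nobody is working on.
const failStaleJobs = (jobs, now, staleAfterMs) => {
    const failed = [];
    for (const job of jobs.values()) {
        if (job.status === 'pending' && now - job.heartbeatAt > staleAfterMs) {
            job.status = 'error';
            job.error = 'Processing was interrupted; please retry';
            failed.push(job.jobId);
        }
    }
    return failed;
};

// Retrying is only allowed from the error state, which prevents two
// concurrent runs of the same job. Returns true if the job was reset.
const retryJob = (jobs, jobId, now) => {
    const job = jobs.get(jobId);
    if (!job) throw new Error('Unknown job');
    if (job.status !== 'error') return false;
    job.status = 'pending';
    job.error = null;
    job.heartbeatAt = now;
    return true;
};
```

`failStaleJobs` would typically run on service startup and then periodically. Cancellation could follow the same pattern with an additional "cancelled" state that the worker checks between users.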