Distributing tasks across multiple Cloud Functions

Time:05-11

Let's say I have 1,000 documents in a Firestore collection.

How do I run the same Cloud Function 10 times in parallel, with each execution processing 100 documents, say every 5 minutes?

I am aware I can use a Scheduler for the "every 5 minutes" part. The objective here is to distribute the load using multiple executions of the same function in parallel to handle the tasks. When the collection grows, I would like to add more instances. For example, let's say 1 execution per 100 documents.

I don't mind having another (or more) function to handle the distribution itself, and I don't mind the number of executions. I just don't want to loop through a large collection and process the tasks in a single function execution.

The numbers given above are examples. I am also open to using other services within GCP.

CodePudding user response:

If you want to execute the Cloud Function every time a Firestore document changes, you can use a Cloud Firestore trigger in Cloud Functions. The function waits for changes, is triggered when an event occurs, and then performs its task. You can go through these documents on Firestore triggers: Google Cloud Firestore Trigger, Cloud Firestore Triggers.

If you are concerned that Cloud Functions will not be able to process the requests in parallel, check out this document. Cloud Functions handles each incoming request by assigning it to an instance; if the volume of requests increases, Cloud Functions starts new instances to handle them.
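As a rough sketch of the trigger approach: the handler below is a plain function that takes the change object a Firestore onWrite trigger would pass in. The collection name tasks, the function names, and the returned object are assumptions made for illustration, and the registration is shown only in a comment because it depends on the firebase-functions package (v1 API).

```javascript
// Hypothetical per-document handler. With firebase-functions (v1 API) installed,
// it could be registered roughly like this:
//   const functions = require('firebase-functions');
//   exports.onTaskWrite = functions.firestore
//     .document('tasks/{taskId}')   // 'tasks' is an assumed collection name
//     .onWrite((change, context) => handleTaskWrite(change));
function handleTaskWrite(change) {
  // `change.after.exists` is false when the write was a delete.
  if (!change.after.exists) {
    return { action: 'skipped', reason: 'deleted' };
  }
  const data = change.after.data();
  // Placeholder for the real per-document work.
  return {
    action: 'processed',
    path: change.after.ref.path,
    fields: Object.keys(data),
  };
}
```

Each document write produces its own invocation, so the platform spreads the work across instances without any explicit distribution logic. Note that this reacts to writes; it does not by itself cover the "every 5 minutes" requirement.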

CodePudding user response:

Let's assume you have a function that, when called, processes a single document and does whatever you need with it. Let's call that function doSomething, and let's assume it takes the document's path as a parameter.

Then, you can create a function that will be scheduled every 5 minutes. In this function, you'll retrieve all the documents, holding them in an array (let's call it documents) and do something like:

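// Assumes `functions` was obtained elsewhere via getFunctions() from 'firebase/functions'.
import { httpsCallable } from 'firebase/functions';

const doSomething = httpsCallable(functions, 'doSomething');

// Start one call per document; map() collects the returned promises.
const calls = documents.map((document) => doSomething({ path: document.path }));

// Wait until every call has completed (rejects as soon as any call fails).
await Promise.all(calls);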

This starts all the calls at once and collects their promises in an array. Promise.all then waits for every call to finish, so you get parallel executions of the same function, one per document.
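The question asks for one execution per 100 documents rather than one per document. A minimal sketch of that variation follows: split the document paths into fixed-size batches and fire one call per batch. The names chunk, dispatchInBatches, and invokeBatch are hypothetical; invokeBatch stands for whatever async call hands a batch to the worker function (for example a callable like the one above, adapted to accept a list of paths).

```javascript
// Split an array into consecutive chunks of at most `size` items.
function chunk(items, size) {
  if (!Number.isInteger(size) || size <= 0) {
    throw new RangeError('size must be a positive integer');
  }
  const chunks = [];
  for (let i = 0; i < items.length; i += size) {
    chunks.push(items.slice(i, i + size));
  }
  return chunks;
}

// Fire one call per batch and wait for all of them to settle.
// `invokeBatch` is a hypothetical async function supplied by the caller.
async function dispatchInBatches(paths, batchSize, invokeBatch) {
  const batches = chunk(paths, batchSize);
  // allSettled (unlike Promise.all) does not reject on the first failure,
  // so one failing batch does not hide the outcome of the others.
  const results = await Promise.allSettled(
    batches.map((batch, index) => invokeBatch({ batchIndex: index, paths: batch }))
  );
  const failed = results.filter((r) => r.status === 'rejected').length;
  return { total: batches.length, failed };
}
```

With 1,000 paths and a batch size of 100 this produces 10 calls; as the collection grows, the number of calls grows with it while each execution keeps handling at most 100 documents.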
