I have to run a function on a list of items. I'm using Azure Durable Functions, and can run the items in parallel using their fan out/fan in strategy.
However, I'm wondering how that differs from using the new Parallel.ForEachAsync method within a single Activity function. I need to use Durable Functions because this is an eternal orchestration that is restarted upon completion.
CodePudding user response:
Parallel.ForEachAsync is bound to a single Function App instance, which means it is limited to the resources that instance has. On the Consumption plan, that means 1 vCPU.
When using a Fan out / Fan in approach with Durable Functions, each invocation of F2 (see image) can run on a different Function App instance as the app scales out, and each of those instances can use the entirety of the resources allocated to it.
In short: with a Fan out / Fan in approach, you can use (a lot) more resources, potentially giving you a much faster result.
Chances are you would be best off with a combination of the two: dispatch batches of work to 'F2', and have each 'F2' invocation process its batch in parallel.
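A minimal sketch of that combination, assuming .NET 6 or later. The names BatchSketch, ToBatches and ProcessBatchAsync are hypothetical, and the doubling step is only a stand-in for the real work. In a real app, the orchestrator would send each batch to an F2 activity, and the body of F2 would look roughly like ProcessBatchAsync:

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

public static class BatchSketch
{
    // Orchestrator side: split the items into fixed-size batches,
    // one batch per F2 activity call. Enumerable.Chunk requires .NET 6+.
    public static List<int[]> ToBatches(IEnumerable<int> items, int batchSize) =>
        items.Chunk(batchSize).ToList();

    // Activity side (body of a hypothetical F2): process one batch with
    // bounded parallelism inside a single Function App instance.
    public static async Task<int[]> ProcessBatchAsync(int[] batch, int maxParallel)
    {
        var results = new ConcurrentBag<int>();
        var options = new ParallelOptions { MaxDegreeOfParallelism = maxParallel };

        await Parallel.ForEachAsync(batch, options, async (item, ct) =>
        {
            await Task.Delay(10, ct);   // stand-in for real I/O-bound work
            results.Add(item * 2);      // stand-in for the real result
        });

        // ConcurrentBag has no defined order, so sort before returning.
        return results.OrderBy(x => x).ToArray();
    }
}
```

The batch size and the degree of parallelism are tuning knobs. Larger batches mean fewer activity calls, but more work is lost and repeated if one activity fails.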
The fan-out work is distributed to multiple instances of the F2 function. The work is tracked by using a dynamic list of tasks. Task.WhenAll is called to wait for all the called functions to finish. Then, the F2 function outputs are aggregated from the dynamic task list and passed to the F3 function. The automatic checkpointing that happens at the await call on Task.WhenAll ensures that a potential midway crash or reboot doesn't require restarting an already completed task.
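The shape of that fan-out / fan-in step can be sketched in plain C# without the Durable Functions SDK. Here callF2 is a hypothetical stand-in for the orchestrator's call to the F2 activity (for example context.CallActivityAsync<int>("F2", item) in the in-process model), and summing is just one possible aggregation. Note that this plain version does not checkpoint anything. The checkpointing described above comes from the Durable Functions runtime.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;

public static class FanOutFanInSketch
{
    // callF2 is a hypothetical stand-in for the orchestrator calling
    // the F2 activity function with one work item.
    public static async Task<int> RunAsync(
        IReadOnlyList<int> workItems, Func<int, Task<int>> callF2)
    {
        // Fan out: start every F2 call without awaiting it yet,
        // collecting the tasks in a dynamic list.
        var tasks = new List<Task<int>>();
        foreach (var item in workItems)
        {
            tasks.Add(callF2(item));
        }

        // Fan in: wait for all started calls to finish.
        int[] results = await Task.WhenAll(tasks);

        // Aggregate the F2 outputs; a real orchestrator would pass
        // the aggregate on to F3.
        return results.Sum();
    }
}
```

For example, calling RunAsync with the items 1, 2, 3 and a callF2 that squares its input would aggregate 1 + 4 + 9.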
CodePudding user response:
To me the main difference is that Durable Functions takes care of crash recovery for you (and of retries, if you configure a retry policy on the activity call):
The automatic checkpointing that happens at the await call on Task.WhenAll ensures that a potential midway crash or reboot doesn't require restarting an already completed task.
In rare circumstances, it's possible that a crash could happen in the window after an activity function completes but before its completion is saved into the orchestration history. If this happens, the activity function would re-run from the beginning after the process recovers.
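Because an activity can occasionally run more than once, it helps if the activity is idempotent. The following is only a rough sketch of that idea. IdempotentActivitySketch, Process and the in-memory dictionary are hypothetical, and a real implementation would need a durable store (a database row or blob keyed by the work item ID), since in-memory state does not survive the crash in question.

```csharp
using System.Collections.Concurrent;
using System.Threading;

public static class IdempotentActivitySketch
{
    // Hypothetical in-memory stand-in for a durable store of completed work.
    private static readonly ConcurrentDictionary<string, int> Completed = new();

    // Counts how often the "real" work actually ran (for illustration only).
    public static int SideEffectCount;

    public static int Process(string workItemId, int input)
    {
        // If this work item was already processed, return the stored result
        // instead of redoing the work. Note: under concurrent calls for the
        // same key, GetOrAdd may invoke the factory more than once.
        return Completed.GetOrAdd(workItemId, _ =>
        {
            Interlocked.Increment(ref SideEffectCount);
            return input * 2;   // stand-in for the real work
        });
    }
}
```

With this guard, a second sequential call for the same work item ID returns the stored result and does not repeat the side effect.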