Dropbox's ATF - How are functions/callbacks stored in the database?

I am reading about Dropbox's Async Task Framework (ATF) and its architecture on the Dropbox tech blog: https://dropbox.tech/infrastructure/asynchronous-task-scheduling-at-dropbox

The architecture seems clear to me, but what I can't understand is how the callbacks (lambdas in their terminology) can be stored in the database for later execution. They are just normal programming language functions, right? Or am I missing something here?

Also, the article says:

"It would need to support nearly 100 unique async task types from the start, again with room to grow."

It seems they are talking about types of lambdas here. But how is that even possible when the user can provide arbitrary code in the callback function?

Any help would be appreciated. Thanks!

CodePudding user response：

Let me share how this is done in Hangfire, a popular job scheduler in the .NET world. I use it as an example because I have some experience with it and its source code is publicly available on GitHub.

Enqueueing a recurring job

RecurringJob.AddOrUpdate(() => Console.WriteLine("Transparent!"), Cron.Daily);

The RecurringJob class defines several overloads for AddOrUpdate to accept different methodCall parameters:

Expression<Action>: Synchronous code without any parameter
Expression<Action<T>>: Synchronous code with a single parameter
Expression<Func<Task>>: Asynchronous code without any parameter
Expression<Func<T, Task>>: Asynchronous code with a single parameter

The overloads expect an Expression rather than just a delegate (a Func or an Action), because an expression lets Hangfire retrieve meta information about:

- the type on which the given method should be called
- the method itself
- the parameter(s) to pass to it

Retrieving meta data

There is a class called Job which exposes several FromExpression overloads. All of them call a private method that does the heavy lifting: it retrieves the type, method and argument meta data.

For the example above, FromExpression retrieves the following data:

type: System.Console, mscorlib
method: WriteLine
parameter type: System.String
argument: "Transparent!"

This information is stored in the Job's properties: Type, Method and Args.
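The same idea can be sketched outside .NET. The following is a minimal Python analogue, not Hangfire's code: `JobDescriptor` and `describe_call` are hypothetical names made up for illustration. It records where a function lives and which arguments to pass, and it refuses lambdas and nested functions because those cannot be looked up again by name.

```python
from dataclasses import dataclass
from typing import Any, Callable, List


@dataclass
class JobDescriptor:
    """Hypothetical record: where the function lives, plus its arguments."""
    module: str
    qualname: str
    args: List[Any]


def describe_call(func: Callable[..., Any], *args: Any) -> JobDescriptor:
    # Only importable, module-level functions can be found again by name.
    # Lambdas and nested functions have no stable, resolvable name.
    if "<lambda>" in func.__qualname__ or "<locals>" in func.__qualname__:
        raise ValueError("only module-level functions can be described by name")
    return JobDescriptor(func.__module__, func.__qualname__, list(args))


# Analogue of the Console.WriteLine("Transparent!") example above.
job = describe_call(print, "Transparent!")
print(job)
```

Note that nothing about the function body is captured, only enough information to find the function again.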

Serializing meta info

The RecurringJobManager receives this job and passes it to a transaction via a RecurringJobEntity wrapper to perform an update if the job's definition has changed or the job has not been registered yet.

The serialization happens inside its GetChangedFields method, via the JobHelper and InvocationData classes. Under the hood they use Newtonsoft's Json.NET to perform the serialization.

Back to our example, the serialized job (without the cron expression) looks something like this:

{
   "t":"System.Console, mscorlib",
   "m":"WriteLine",
   "p":[
      "System.String"
   ],
   "a":[
      "Transparent!"
   ]
}

This is what is persisted in the database and read back from it whenever the job needs to be triggered.
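To make the round trip concrete, here is a small Python sketch of the general technique (store a name and arguments, resolve the name at execution time). It is an illustration under assumptions, not how Hangfire is implemented: `serialize_call` and `run_serialized` are hypothetical helpers, and the short keys `t`, `m` and `a` are only borrowed from the JSON above for readability.

```python
import importlib
import json
import math


def serialize_call(func, *args):
    """Store where the function lives and its arguments, never its body."""
    return json.dumps({"t": func.__module__, "m": func.__qualname__, "a": list(args)})


def run_serialized(payload):
    """Resolve the function by name in the executing process and call it."""
    data = json.loads(payload)
    module = importlib.import_module(data["t"])  # e.g. the "math" module
    func = getattr(module, data["m"])            # e.g. the "sqrt" function
    return func(*data["a"])


payload = serialize_call(math.sqrt, 16)  # this string is what would be persisted
result = run_serialized(payload)         # later, possibly in another process
print(payload, result)
```

The executing process must already have the named code available, which is why only a reference needs to be stored.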

CodePudding user response：

I found the answer in the article itself. The core ATF framework just defines the types of tasks/callbacks it supports (e.g. sending email is one type of task) and creates corresponding SQS queues for them (for each task type, there are multiple queues for different priorities).

The user (who schedules the task) does not provide the function definition when scheduling the task. They only provide details of the function/callback they want to schedule. Those details are pushed to the SQS queue, and it is the user's responsibility to create worker machines that listen for that specific type of task on SQS and that hold the function/callback definition (e.g. the actual logic of sending email).

Therefore, there is no need to store the function definition in the database. Here's the exact section from the article that describes this: https://dropbox.tech/infrastructure/asynchronous-task-scheduling-at-dropbox#ownership-model

Ownership model
ATF is designed to be a self-serve framework for developers at Dropbox. The design is very intentional in driving an ownership model where lambda owners own all aspects of their lambdas’ operations. To promote this, all lambda worker clusters are owned by the lambda owners. They have full control over operations on these clusters, including code deployments and capacity management. Each executor process is bound to one lambda. Owners have the option of deploying multiple lambdas on their worker clusters simply by spawning new executor processes on their hosts.
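The split this answer describes (the scheduler stores only a task type and its arguments, while the worker owns the code) can be sketched roughly as follows. This is a toy model under stated assumptions, not Dropbox's API: `KNOWN_TASK_TYPES`, `HANDLERS`, `schedule` and `work_once` are hypothetical names, and an in-memory deque stands in for the SQS queue.

```python
import json
from collections import deque

# Framework side: the set of registered task types (hypothetical).
KNOWN_TASK_TYPES = {"send_email"}

# Stand-in for the queue of one task type (SQS in the article).
queue = deque()


def schedule(task_type, **kwargs):
    """Scheduler side: persist only the task type and its arguments."""
    if task_type not in KNOWN_TASK_TYPES:
        raise ValueError(f"unregistered task type: {task_type}")
    queue.append(json.dumps({"type": task_type, "args": kwargs}))


# Worker side: the task owner deploys the actual code on their own workers.
def send_email(to, subject):
    return f"sent '{subject}' to {to}"


HANDLERS = {"send_email": send_email}


def work_once():
    """Worker side: pop one message and run the locally deployed handler."""
    message = json.loads(queue.popleft())
    return HANDLERS[message["type"]](**message["args"])


schedule("send_email", to="dev@example.com", subject="Hello")
print(queue[0])     # what is stored: a name and arguments, no code
print(work_once())  # the worker supplies the code
```

Under this model, "arbitrary code" is never stored or transmitted; supporting a new task type means registering a new name and deploying a worker that handles it.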