Django: handling a long-running task (~50 sec) spawned from another request (~3 sec).
I have a POST request that returns some information to the user. Inside this request, it calls another API in the same app, which queries the database, generates a PDF report, and uploads it to S3; this takes about 50 seconds.
How can I have the first request return its information to the user right away and run the PDF-generation API in the background?
I have done some research and found that Celery may be able to handle this task. Is it recommended, or does anyone have other advice?
Thanks in advance!
CodePudding user response:
Yes, this is where you'd bring in a solution like Celery, RQ, or Huey.
On the backend you will use a broker like Redis, which stores the state of the jobs you schedule (and whether they errored).
Of the three above, I highly recommend Celery. It's been around the longest and has better telemetry integration with services like Sentry and Scout APM.
To get started, see First steps with Django in the Celery documentation and its sample Django project on GitHub.
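As a rough sketch of the wiring (the project name proj and the Redis URL here are assumptions, not from your setup), the docs' pattern is a celery.py next to your settings module:

```python
# proj/celery.py -- minimal Celery app wired to Django settings
# ("proj" and the Redis URL are placeholders for this sketch)
import os

from celery import Celery

os.environ.setdefault("DJANGO_SETTINGS_MODULE", "proj.settings")

app = Celery("proj", broker="redis://localhost:6379/0")
app.config_from_object("django.conf:settings", namespace="CELERY")
app.autodiscover_tasks()  # find tasks.py modules in installed apps
```

You'd then run a worker alongside your web process, e.g. celery -A proj worker --loglevel=INFO.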
Getting into the job mindset
Data is serialized
This means that to transport data, the content will be pickled or encoded as JSON.
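With the namespace="CELERY" config shown above, you can pin the serializer to JSON in your Django settings (a sketch; these map to Celery's task_serializer, result_serializer, and accept_content options):

```python
# settings.py -- force JSON so anything a task receives must be JSON-safe
CELERY_TASK_SERIALIZER = "json"
CELERY_RESULT_SERIALIZER = "json"
CELERY_ACCEPT_CONTENT = ["json"]
```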
Pass object IDs / naïve data to scheduled functions
Good: pass book_id as a str, then look up Book.objects.get(pk=book_id) inside the scheduled function (see the task sketch after this list). Even if it means making a redundant query, it's at least fresh data you can rely on.
Dangerous: passing an instance of a model (e.g. a book instance of the Book model) in the job params. The task may simply error because the instance isn't serializable. Even if it serializes, your data may be outdated or stale by the time the function runs.
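Applied to your PDF case, a task might look like this (the Report model and its render_pdf/upload_to_s3 methods are hypothetical stand-ins for your own code; the task only ever receives an ID):

```python
# reports/tasks.py -- the task takes naive data (a pk), never an instance
from celery import shared_task

from .models import Report  # hypothetical model


@shared_task
def generate_pdf_report(report_id: str):
    # Redundant query, but guarantees fresh data at execution time
    report = Report.objects.get(pk=report_id)
    pdf_bytes = report.render_pdf()   # hypothetical: build the PDF
    report.upload_to_s3(pdf_bytes)    # hypothetical: push it to S3
```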
Save the task IDs so you can look up the state of a job later: this makes it possible to tell whether a job is already underway.
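A sketch of how that could look in your view, assuming a task_id field on the hypothetical Report model and the generate_pdf_report task above:

```python
# reports/views.py -- kick off the job, persist its ID, return immediately
from celery.result import AsyncResult
from django.http import JsonResponse

from .models import Report
from .tasks import generate_pdf_report


def request_report(request):
    report = Report.objects.create(owner=request.user)
    result = generate_pdf_report.delay(str(report.pk))
    report.task_id = result.id  # assumes a task_id field on the model
    report.save(update_fields=["task_id"])
    return JsonResponse({"report_id": report.pk, "task_id": result.id})


def report_status(request, task_id):
    # Look up the job's state (PENDING, STARTED, SUCCESS, FAILURE, ...)
    state = AsyncResult(task_id).state
    return JsonResponse({"task_id": task_id, "state": state})
```

The client gets its 3-second response right away and can poll the status endpoint until the report is ready.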