Home > Software engineering >  why would you separate a celery worker and django container?
why would you separate a celery worker and django container?

Time:01-27

I am building a django app with celery. I tried composing a docker-compose without a container for the worker. In my Dockerfile for django, an entrypoint running the celery worker and django app:

...
python manage.py migrate
celery -A api worker -l INFO --detach
python manage.py runserver 0.0.0.0:8000

The celery will run using this order but not django runserver. I have seen in tutorials that they separated the django container from woker container or vice-versa. I do not see the explanation for this separation. I also observed that the two python container (django, worker) has the same volume. How can celery add tasks if it has a different environment with django? In my mind there would be two django apps (the same volume) for two containers only 1 running the runserver, and the other one running the celery worker. I do not understand the separation.

CodePudding user response:

You should aim to set up your containers to run only a single foreground process in each container, and no background processes. Even in this simple example, there are two obvious advantages: if the Celery worker fails, you can restart a standalone container, but it's invisible to Docker as a background process; and you can separately read the docker logs of the Web server and background worker without having them intertwined. At larger scale you can imagine wanting to run different numbers of Django and Celery containers depending on your load.

To make this work it's important that the entrypoint script not run the program directly. It is passed the (possibly overridden) container command as arguments, and you can use a special shell construct to run that

#!/bin/sh
./manage.py migrate
exec "$@"

In the Dockerfile, declare both the ENTRYPOINT and a default CMD to run, say, the Web server

ENTRYPOINT ["./entrypoint.sh"]  # probably unchanged, must be JSON array syntax
CMD ["./manage.py", "runserver", "0.0.0.0:8000"]

In a Compose setup, you can run multiple containers off the same image, but override the command: for a Celery worker.

version: '3.8'
services:
  web:
    build: .
    ports: ['8000:8000']
    environment:
      REDIS_HOST: redis
  worker:
    build: .
    command: celery -A api worker -l INFO
    environment:
      REDIS_HOST: redis
  redis:
    image: redis

The main application communicates with the worker via a queue in Redis (or another store), so there's no need for them to be in the same container.

CodePudding user response:

As Celery documentation mentions:

Celery communicates via messages, usually using a broker to mediate between clients and workers. To initiate a task the client adds a message to the queue, the broker then delivers that message to a worker.

Meaning the communication between the Client (Django) and Worker (Celery) are done through a message queue. Hence it does not matter if the workers and clients in separate containers or even separate machines. If the Client can access the message queue (for example using Redis or RabbitMQ) and worker can pop tasks from that queue, it will always work.

About the docker-compose part, there is no ideal standard for keeping or separating Celery and Django. You can put them in same container or not, it is up to you and what are the requirements of the project. If you are using two containers, then they need to share volumes because of the source code and any other data which are needed for executing tasks.

  • Related