I'm trying to run headless Chrome with Selenium in an amd64 Ubuntu container on an M1 Mac host.
Because arm64 Ubuntu has no google-chrome-stable package, I decided to use an amd64 Ubuntu base image.
But it does not work; the worker logs the following errors.
worker_1 | [2021-10-31 03:58:23,286: DEBUG/ForkPoolWorker-10] POST http://localhost:43035/session {"capabilities": {"firstMatch": [{}], "alwaysMatch": {"browserName": "chrome", "pageLoadStrategy": "normal", "goog:chromeOptions": {"extensions": [], "args": ["--no-sandbox", "--disable-dev-shm-usage", "--disable-gpu", "--remote-debugging-port=9222", "--headless"]}}}, "desiredCapabilities": {"browserName": "chrome", "pageLoadStrategy": "normal", "goog:chromeOptions": {"extensions": [], "args": ["--no-sandbox", "--disable-dev-shm-usage", "--disable-gpu", "--remote-debugging-port=9222", "--headless"]}}}
worker_1 | [2021-10-31 03:58:23,330: DEBUG/ForkPoolWorker-10] Starting new HTTP connection (1): localhost:43035
worker_1 | [2021-10-31 03:58:41,311: DEBUG/ForkPoolWorker-12] http://localhost:47089 "POST /session HTTP/1.1" 500 717
worker_1 | [2021-10-31 03:58:41,412: DEBUG/ForkPoolWorker-12] Finished Request
worker_1 | [2021-10-31 03:58:41,825: WARNING/ForkPoolWorker-12] Error occurred while initializing chromedriver - Message: unknown error: unable to discover open window in chrome
worker_1 | (Session info: headless chrome=95.0.4638.69)
worker_1 | Stacktrace:
worker_1 | #0 0x004000a18f93 <unknown>
worker_1 | #1 0x0040004f3908 <unknown>
worker_1 | #2 0x0040004d3cdf <unknown>
worker_1 | #3 0x00400054cabe <unknown>
worker_1 | #4 0x004000546973 <unknown>
worker_1 | #5 0x00400051cdf4 <unknown>
worker_1 | #6 0x00400051dde5 <unknown>
worker_1 | #7 0x004000a482be <unknown>
worker_1 | #8 0x004000a5dba0 <unknown>
worker_1 | #9 0x004000a49215 <unknown>
worker_1 | #10 0x004000a5efe8 <unknown>
worker_1 | #11 0x004000a3d9db <unknown>
worker_1 | #12 0x004000a7a218 <unknown>
worker_1 | #13 0x004000a7a398 <unknown>
worker_1 | #14 0x004000a956cd <unknown>
worker_1 | #15 0x004002b29609 <unknown>
worker_1 |
worker_1 | [2021-10-31 03:58:41,826: WARNING/ForkPoolWorker-12]
worker_1 |
worker_1 | [2021-10-31 03:58:41,867: DEBUG/ForkPoolWorker-11] http://localhost:58147 "POST /session HTTP/1.1" 500 717
worker_1 | [2021-10-31 03:58:41,907: DEBUG/ForkPoolWorker-11] Finished Request
worker_1 | [2021-10-31 03:58:41,946: DEBUG/ForkPoolWorker-12] Using selector: EpollSelector
worker_1 | [WDM] -
worker_1 |
worker_1 | [2021-10-31 03:58:41,962: INFO/ForkPoolWorker-12]
worker_1 |
worker_1 | [WDM] - ====== WebDriver manager ======
worker_1 | [2021-10-31 03:58:41,971: INFO/ForkPoolWorker-12] ====== WebDriver manager ======
worker_1 | [2021-10-31 03:58:42,112: WARNING/ForkPoolWorker-11] Error occurred while initializing chromedriver - Message: unknown error: unable to discover open window in chrome
worker_1 | (Session info: headless chrome=95.0.4638.69)
worker_1 | Stacktrace:
worker_1 | #0 0x004000a18f93 <unknown>
worker_1 | #1 0x0040004f3908 <unknown>
worker_1 | #2 0x0040004d3cdf <unknown>
worker_1 | #3 0x00400054cabe <unknown>
worker_1 | #4 0x004000546973 <unknown>
worker_1 | #5 0x00400051cdf4 <unknown>
worker_1 | #6 0x00400051dde5 <unknown>
worker_1 | #7 0x004000a482be <unknown>
worker_1 | #8 0x004000a5dba0 <unknown>
worker_1 | #9 0x004000a49215 <unknown>
worker_1 | #10 0x004000a5efe8 <unknown>
worker_1 | #11 0x004000a3d9db <unknown>
worker_1 | #12 0x004000a7a218 <unknown>
worker_1 | #13 0x004000a7a398 <unknown>
worker_1 | #14 0x004000a956cd <unknown>
worker_1 | #15 0x004002b29609 <unknown>
worker_1 |
worker_1 | [2021-10-31 03:58:42,113: WARNING/ForkPoolWorker-11]
worker_1 |
worker_1 | [2021-10-31 03:58:42,166: DEBUG/ForkPoolWorker-11] Using selector: EpollSelector
worker_1 | [WDM] -
worker_1 |
worker_1 | [2021-10-31 03:58:42,169: INFO/ForkPoolWorker-11]
worker_1 |
worker_1 | [WDM] - ====== WebDriver manager ======
worker_1 | [2021-10-31 03:58:42,170: INFO/ForkPoolWorker-11] ====== WebDriver manager ======
worker_1 | [2021-10-31 03:58:42,702: DEBUG/ForkPoolWorker-9] http://localhost:51793 "POST /session HTTP/1.1" 500 866
worker_1 | [2021-10-31 03:58:42,719: DEBUG/ForkPoolWorker-9] Finished Request
worker_1 | [2021-10-31 03:58:42,986: WARNING/ForkPoolWorker-9] Error occurred while initializing chromedriver - Message: unknown error: Chrome failed to start: crashed.
worker_1 | (chrome not reachable)
worker_1 | (The process started from chrome location /usr/bin/google-chrome is no longer running, so ChromeDriver is assuming that Chrome has crashed.)
worker_1 | Stacktrace:
worker_1 | #0 0x004000a18f93 <unknown>
worker_1 | #1 0x0040004f3908 <unknown>
worker_1 | #2 0x004000516b32 <unknown>
worker_1 | #3 0x00400051265d <unknown>
worker_1 | #4 0x00400054c770 <unknown>
worker_1 | #5 0x004000546973 <unknown>
worker_1 | #6 0x00400051cdf4 <unknown>
worker_1 | #7 0x00400051dde5 <unknown>
worker_1 | #8 0x004000a482be <unknown>
worker_1 | #9 0x004000a5dba0 <unknown>
worker_1 | #10 0x004000a49215 <unknown>
worker_1 | #11 0x004000a5efe8 <unknown>
worker_1 | #12 0x004000a3d9db <unknown>
worker_1 | #13 0x004000a7a218 <unknown>
worker_1 | #14 0x004000a7a398 <unknown>
worker_1 | #15 0x004000a956cd <unknown>
worker_1 | #16 0x004002b29609 <unknown>
worker_1 |
worker_1 | [2021-10-31 03:58:42,987: WARNING/ForkPoolWorker-9]
worker_1 |
worker_1 | [2021-10-31 03:58:43,045: DEBUG/ForkPoolWorker-9] Using selector: EpollSelector
worker_1 | [WDM] -
worker_1 |
worker_1 | [2021-10-31 03:58:43,049: INFO/ForkPoolWorker-9]
worker_1 |
worker_1 | [WDM] - ====== WebDriver manager ======
worker_1 | [2021-10-31 03:58:43,050: INFO/ForkPoolWorker-9] ====== WebDriver manager ======
worker_1 | [2021-10-31 03:58:43,936: DEBUG/ForkPoolWorker-10] http://localhost:43035 "POST /session HTTP/1.1" 500 866
worker_1 | [2021-10-31 03:58:43,952: DEBUG/ForkPoolWorker-10] Finished Request
worker_1 | [2021-10-31 03:58:44,163: WARNING/ForkPoolWorker-10] Error occurred while initializing chromedriver - Message: unknown error: Chrome failed to start: crashed.
worker_1 | (chrome not reachable)
worker_1 | (The process started from chrome location /usr/bin/google-chrome is no longer running, so ChromeDriver is assuming that Chrome has crashed.)
worker_1 | Stacktrace:
worker_1 | #0 0x004000a18f93 <unknown>
worker_1 | #1 0x0040004f3908 <unknown>
worker_1 | #2 0x004000516b32 <unknown>
worker_1 | #3 0x00400051265d <unknown>
worker_1 | #4 0x00400054c770 <unknown>
worker_1 | #5 0x004000546973 <unknown>
worker_1 | #6 0x00400051cdf4 <unknown>
worker_1 | #7 0x00400051dde5 <unknown>
worker_1 | #8 0x004000a482be <unknown>
worker_1 | #9 0x004000a5dba0 <unknown>
worker_1 | #10 0x004000a49215 <unknown>
worker_1 | #11 0x004000a5efe8 <unknown>
worker_1 | #12 0x004000a3d9db <unknown>
worker_1 | #13 0x004000a7a218 <unknown>
worker_1 | #14 0x004000a7a398 <unknown>
worker_1 | #15 0x004000a956cd <unknown>
worker_1 | #16 0x004002b29609 <unknown>
worker_1 |
worker_1 | [2021-10-31 03:58:44,164: WARNING/ForkPoolWorker-10]
worker_1 |
worker_1 | [2021-10-31 03:58:44,205: DEBUG/ForkPoolWorker-10] Using selector: EpollSelector
worker_1 | [WDM] -
worker_1 |
worker_1 | [2021-10-31 03:58:44,215: INFO/ForkPoolWorker-10]
worker_1 |
worker_1 | [WDM] - ====== WebDriver manager ======
worker_1 | [2021-10-31 03:58:44,217: INFO/ForkPoolWorker-10] ====== WebDriver manager ======
worker_1 | [WDM] - Current google-chrome version is 95.0.4638
worker_1 | [2021-10-31 03:58:44,520: INFO/ForkPoolWorker-12] Current google-chrome version is 95.0.4638
worker_1 | [WDM] - Get LATEST driver version for 95.0.4638
worker_1 | [2021-10-31 03:58:44,525: INFO/ForkPoolWorker-12] Get LATEST driver version for 95.0.4638
worker_1 | [WDM] - Current google-chrome version is 95.0.4638
worker_1 | [2021-10-31 03:58:44,590: INFO/ForkPoolWorker-11] Current google-chrome version is 95.0.4638
worker_1 | [WDM] - Get LATEST driver version for 95.0.4638
worker_1 | [2021-10-31 03:58:44,593: INFO/ForkPoolWorker-11] Get LATEST driver version for 95.0.4638
worker_1 | [2021-10-31 03:58:44,599: DEBUG/ForkPoolWorker-12] Starting new HTTPS connection (1): chromedriver.storage.googleapis.com:443
worker_1 | [2021-10-31 03:58:44,826: DEBUG/ForkPoolWorker-11] Starting new HTTPS connection (1): chromedriver.storage.googleapis.com:443
worker_1 | [2021-10-31 03:58:45,205: DEBUG/ForkPoolWorker-11] https://chromedriver.storage.googleapis.com:443 "GET /LATEST_RELEASE_95.0.4638 HTTP/1.1" 200 12
worker_1 | [2021-10-31 03:58:45,213: DEBUG/ForkPoolWorker-12] https://chromedriver.storage.googleapis.com:443 "GET /LATEST_RELEASE_95.0.4638 HTTP/1.1" 200 12
worker_1 | [WDM] - Driver [/home/ubuntu/.wdm/drivers/chromedriver/linux64/95.0.4638.54/chromedriver] found in cache
worker_1 | [2021-10-31 03:58:45,219: INFO/ForkPoolWorker-11] Driver [/home/ubuntu/.wdm/drivers/chromedriver/linux64/95.0.4638.54/chromedriver] found in cache
worker_1 | [WDM] - Driver [/home/ubuntu/.wdm/drivers/chromedriver/linux64/95.0.4638.54/chromedriver] found in cache
worker_1 | [2021-10-31 03:58:45,242: INFO/ForkPoolWorker-12] Driver [/home/ubuntu/.wdm/drivers/chromedriver/linux64/95.0.4638.54/chromedriver] found in cache
worker_1 | [WDM] - Current google-chrome version is 95.0.4638
worker_1 | [2021-10-31 03:58:45,603: INFO/ForkPoolWorker-9] Current google-chrome version is 95.0.4638
worker_1 | [WDM] - Get LATEST driver version for 95.0.4638
worker_1 | [2021-10-31 03:58:45,610: INFO/ForkPoolWorker-9] Get LATEST driver version for 95.0.4638
Similar logs repeat in a loop.
When I launch Chrome manually inside the Docker container, I get this error:
ubuntu@742a62c61201:/backend$ google-chrome --no-sandbox --disable-dev-shm-usage --disable-gpu --remote-debugging-port=9222 --headless
qemu: uncaught target signal 5 (Trace/breakpoint trap) - core dumped
qemu: uncaught target signal 5 (Trace/breakpoint trap) - core dumped
[1031/041139.297323:ERROR:bus.cc(392)] Failed to connect to the bus: Failed to connect to socket /var/run/dbus/system_bus_socket: No such file or directory
[1031/041139.310612:ERROR:file_path_watcher_linux.cc(326)] inotify_init() failed: Function not implemented (38)
DevTools listening on ws://127.0.0.1:9222/devtools/browser/32b15b93-3fe0-4cb8-9c96-8aea011686a8
qemu: unknown option 'type=utility'
[1031/041139.463057:ERROR:gpu_process_host.cc(973)] GPU process launch failed: error_code=1002
[1031/041139.463227:WARNING:gpu_process_host.cc(1292)] The GPU process has crashed 1 time(s)
[1031/041139.543335:ERROR:network_service_instance_impl.cc(638)] Network service crashed, restarting service.
qemu: unknown option 'type=utility'
[1031/041139.718793:ERROR:gpu_process_host.cc(973)] GPU process launch failed: error_code=1002
[1031/041139.718877:WARNING:gpu_process_host.cc(1292)] The GPU process has crashed 2 time(s)
[1031/041139.736641:ERROR:network_service_instance_impl.cc(638)] Network service crashed, restarting service.
qemu: unknown option 'type=utility'
[1031/041139.788529:ERROR:gpu_process_host.cc(973)] GPU process launch failed: error_code=1002
[1031/041139.788615:WARNING:gpu_process_host.cc(1292)] The GPU process has crashed 3 time(s)
[1031/041139.798487:ERROR:network_service_instance_impl.cc(638)] Network service crashed, restarting service.
[1031/041139.808256:ERROR:gpu_process_host.cc(973)] GPU process launch failed: error_code=1002
[1031/041139.808372:WARNING:gpu_process_host.cc(1292)] The GPU process has crashed 4 time(s)
qemu: unknown option 'type=utility'
[1031/041139.825267:ERROR:gpu_process_host.cc(973)] GPU process launch failed: error_code=1002
[1031/041139.825354:WARNING:gpu_process_host.cc(1292)] The GPU process has crashed 5 time(s)
[1031/041139.830175:ERROR:network_service_instance_impl.cc(638)] Network service crashed, restarting service.
[1031/041139.839159:ERROR:gpu_process_host.cc(973)] GPU process launch failed: error_code=1002
[1031/041139.839345:WARNING:gpu_process_host.cc(1292)] The GPU process has crashed 6 time(s)
[1031/041139.839816:FATAL:gpu_data_manager_impl_private.cc(417)] GPU process isn't usable. Goodbye.
qemu: uncaught target signal 11 (Segmentation fault) - core dumped
Segmentation fault
ubuntu@742a62c61201:/backend$ qemu: unknown option 'type=utility'
ubuntu@742a62c61201:/backend$
Maybe this issue is related? https://github.com/docker/for-mac/issues/5766
If so, is there no way to dockerize headless Chrome on an M1?
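To tell emulation failures apart from ordinary Chrome startup errors, the worker could scan Chrome's stderr for the qemu lines quoted above. This is only a heuristic sketch: the function names are hypothetical, and the matched strings are taken from the output in this question rather than from any documented QEMU or Chrome interface.

```python
import re
from typing import List

# Strings copied from the Chrome output quoted in this question.
# Matching on them is a heuristic, not an official QEMU or Chrome interface.
QEMU_PATTERNS = (
    re.compile(r"qemu: uncaught target signal \d+"),
    re.compile(r"qemu: unknown option"),
)


def looks_like_qemu_crash(stderr_text: str) -> bool:
    """Return True if the text contains any of the qemu failure lines."""
    return any(pattern.search(stderr_text) for pattern in QEMU_PATTERNS)


def qemu_signals(stderr_text: str) -> List[int]:
    """Collect the signal numbers qemu reported, in order of appearance."""
    found = re.findall(r"qemu: uncaught target signal (\d+)", stderr_text)
    return [int(number) for number in found]
```

If such a check matches, retrying with different Chrome flags is unlikely to help, since the crash happens in the emulation layer rather than in the Selenium setup.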
celery worker Dockerfile
FROM --platform=linux/amd64 ubuntu:20.04
ENV DEBIAN_FRONTEND noninteractive
RUN apt update -y && apt install python3.9 python3-pip python-is-python3 sudo wget -y
RUN pip install --upgrade pip
# set environment variables
ENV PYTHONDONTWRITEBYTECODE 1
ENV PYTHONUNBUFFERED 1
RUN adduser --disabled-password --gecos '' ubuntu
RUN adduser ubuntu sudo
RUN echo '%sudo ALL=(ALL) NOPASSWD:ALL' >> /etc/sudoers
USER ubuntu
RUN wget -q -O - https://dl-ssl.google.com/linux/linux_signing_key.pub | sudo apt-key add -
RUN echo "deb http://dl.google.com/linux/chrome/deb/ stable main" | sudo tee /etc/apt/sources.list.d/google.list
RUN sudo apt update -y && sudo apt install -y google-chrome-stable
ENV PATH="/home/ubuntu/.local/bin:$PATH"
WORKDIR /backend
COPY requirements.txt ./
RUN pip install -r requirements.txt --no-cache-dir
COPY . .
ENV DISPLAY=:99
ENTRYPOINT [ "./run-celery.sh" ]
docker-compose.yml
version: "3.3"
services:
  frontend:
    build:
      context: ./frontend
    ports:
      - "3000:3000"
    volumes:
      - ./frontend:/frontend
    depends_on:
      - backend
    deploy:
      resources:
        limits:
          cpus: "2"
          memory: 4G
        reservations:
          cpus: "0.5"
          memory: 512M
    tty: true
    stdin_open: true
  backend:
    build: ./backend
    ports:
      - "8000:8000"
    volumes:
      - ./backend:/backend
    networks:
      - redis-network
    depends_on:
      - redis
      - worker
    environment:
      - is_docker=1
    deploy:
      resources:
        limits:
          cpus: "2"
          memory: 4G
        reservations:
          cpus: "0.5"
          memory: 512M
    tty: true
  worker:
    build:
      context: ./backend
      dockerfile: ./celery-dockerfile/Dockerfile
    deploy:
      resources:
        limits:
          cpus: "2"
          memory: 4G
        reservations:
          cpus: "0.5"
          memory: 4G
    volumes:
      - ./backend:/backend
    networks:
      - redis-network
    depends_on:
      - redis
    environment:
      - is_docker=1
    privileged: true
    tty: true
    platform: linux/amd64
  redis:
    image: redis:alpine
    command: redis-server --port 6379
    container_name: redis_server
    hostname: redis_server
    labels:
      - "name=redis"
      - "mode=standalone"
    networks:
      - redis-network
    expose:
      - "6379"
    tty: true
networks:
  redis-network:
The full crawler code comes from the AutoCrawler repository; if you want to see the whole crawler, it's best to check out that code.
I've changed the options during trial and error.
from selenium.webdriver.chrome.options import Options

# Chrome flags passed to chromedriver for this crawler
chrome_options = Options()
chrome_options.add_argument('--no-sandbox')
chrome_options.add_argument('--disable-dev-shm-usage')
chrome_options.add_argument('--disable-gpu')
chrome_options.add_argument('--remote-debugging-port=9222')
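The POST /session line in the worker log also carries --headless in its argument list. A minimal sketch of how the logged argument list could be reproduced follows; the helper names are hypothetical, and selenium is imported lazily so the flag list can be inspected without it.

```python
from typing import List

# Flags copied from the "args" list in the POST /session log line.
CHROME_ARGS = [
    "--no-sandbox",
    "--disable-dev-shm-usage",
    "--disable-gpu",
    "--remote-debugging-port=9222",
    "--headless",
]


def chrome_args(headless: bool = True) -> List[str]:
    """Return the Chrome flags, optionally without --headless."""
    return [arg for arg in CHROME_ARGS if headless or arg != "--headless"]


def build_chrome_options(headless: bool = True):
    """Build a Selenium Options object from the flags above."""
    # Imported here so chrome_args() works even where selenium is missing.
    from selenium.webdriver.chrome.options import Options

    options = Options()
    for arg in chrome_args(headless):
        options.add_argument(arg)
    return options
```

Keeping the flags in one list makes it easier to compare what the code requests with what the chromedriver log reports.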
CodePudding user response:
I think there's no way to use Chrome/Chromium in Docker on an M1:
- there is no Chrome binary for arm64 Linux
- Chrome crashes when run in an amd64 container on an M1 host (see the Docker docs)
- Chromium can be installed using snap, but the snap service does not run in Docker (without snap you get a 127 error, because the binary from apt is empty; see the issue report)
What I tried
Chromium supports arm64 Ubuntu, so I tried using Chromium instead of Chrome.
But chromedriver does not officially support arm64, so I used the unofficial binary from the Electron releases. https://stackoverflow.com/a/57586200/11853111
Workaround
In the end, I decided to use geckodriver and Firefox inside Docker.
It works seamlessly regardless of host/container architecture.
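A rough sketch of what the Firefox setup might look like in Python is below. It assumes Selenium 4 and that both firefox and geckodriver are installed in the image and on PATH; the function names are hypothetical and not taken from the crawler.

```python
from typing import List


def firefox_args(headless: bool = True) -> List[str]:
    """Command-line arguments to pass to Firefox."""
    return ["-headless"] if headless else []


def make_firefox_driver(headless: bool = True):
    """Start Firefox through geckodriver (both assumed to be on PATH)."""
    # Imported lazily so firefox_args() can be used without selenium installed.
    from selenium import webdriver
    from selenium.webdriver.firefox.options import Options

    options = Options()
    for arg in firefox_args(headless):
        options.add_argument(arg)
    return webdriver.Firefox(options=options)
```

Usage would be along the lines of `driver = make_firefox_driver()` followed by `driver.quit()` once the crawl finishes.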
CodePudding user response:
Found a solution using an arm64 container!
I installed Chromium from the Debian package server, as described at https://askubuntu.com/questions/1204571/how-to-install-chromium-without-snap (specifically the approach from https://www.inx.one/blog/debian-repo-on-ubuntu).
Dockerfile:
FROM --platform=arm64 ubuntu:20.04
ENV DEBIAN_FRONTEND noninteractive
RUN apt update -y && apt install python3.9 python3-pip python-is-python3 libgl1-mesa-glx axel sudo gdebi-core -y
RUN pip install --upgrade pip
# set environment variables
ENV PYTHONDONTWRITEBYTECODE 1
ENV PYTHONUNBUFFERED 1
WORKDIR /backend
COPY requirements.txt ./
RUN pip install -r requirements.txt --no-cache-dir
RUN umask 22 && \
echo 'Package: *\nPin: release a=eoan\nPin-Priority: 500\n\nPackage: *\nPin: origin "ftp.debian.org"\nPin-Priority: 300\n\nPackage: chromium*\nPin: origin "ftp.debian.org"\nPin-Priority: 700\n\nPackage: libwebpmux3\nPin: origin "*.debian.org"\nPin-Priority: 700' \
> /etc/apt/preferences.d/chromium.pref && \
echo 'deb http://deb.debian.org/debian buster main\ndeb http://deb.debian.org/debian buster-updates main\ndeb http://deb.debian.org/debian-security buster/updates main\n' \
> /etc/apt/sources.list.d/debian.list && \
echo 'deb [signed-by=/usr/share/keyrings/debian-archive-keyring.gpg] http://deb.debian.org/debian stable main\ndeb-src [signed-by=/usr/share/keyrings/debian-archive-keyring.gpg] http://deb.debian.org/debian stable main\n\ndeb [signed-by=/usr/share/keyrings/debian-archive-keyring.gpg] http://deb.debian.org/debian-security/ stable-security main\ndeb-src [signed-by=/usr/share/keyrings/debian-archive-keyring.gpg] http://deb.debian.org/debian-security/ stable-security main\n\ndeb [signed-by=/usr/share/keyrings/debian-archive-keyring.gpg] http://deb.debian.org/debian stable-updates main\ndeb-src [signed-by=/usr/share/keyrings/debian-archive-keyring.gpg] http://deb.debian.org/debian stable-updates main\n' \
> /etc/apt/sources.list.d/debian-stable.list
RUN sudo apt-key adv --keyserver keyserver.ubuntu.com --recv-keys DCC9EFBF77E11517 && \
sudo apt-key adv --keyserver keyserver.ubuntu.com --recv-keys 648ACFD622F3D138 && \
sudo apt-key adv --keyserver keyserver.ubuntu.com --recv-keys AA8E81B4331F7F50 && \
sudo apt-key adv --keyserver keyserver.ubuntu.com --recv-keys 112695A0E562B32A
RUN apt install -y debian-archive-keyring && \
apt update -y && \
apt install chromium-sandbox chromium chromium-driver -y
COPY . .
ENTRYPOINT [ "./run-celery.sh" ]
Another solution: just use a Debian-based image.
Dockerfile:
FROM --platform=arm64 python:3.9
# the official python image is Debian-based
ENV DEBIAN_FRONTEND noninteractive
RUN pip install --upgrade pip
# set environment variables
ENV PYTHONDONTWRITEBYTECODE 1
ENV PYTHONUNBUFFERED 1
WORKDIR /backend
COPY requirements.txt ./
RUN pip install -r requirements.txt --no-cache-dir
COPY . .
RUN apt update -y && apt install libgl1-mesa-glx sudo chromium chromium-driver -y
ENTRYPOINT [ "./run-celery.sh" ]
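With chromium and chromium-driver installed from apt as in the Dockerfiles above, Selenium can be pointed at the system binaries instead of a downloaded driver (the WDM lines in the question show a linux64 chromedriver being fetched, which presumably would not run in an arm64 container). The following is a sketch assuming Selenium 4; the executable names and helper functions are assumptions, so check what your image actually installs.

```python
import shutil
from typing import Callable, Iterable, Optional, Tuple

# Candidate executable names; which ones exist depends on the distribution
# and package version, so treat them as assumptions.
BROWSER_NAMES = ("chromium", "chromium-browser")
DRIVER_NAMES = ("chromedriver",)

Which = Callable[[str], Optional[str]]


def find_first(names: Iterable[str], which: Which = shutil.which) -> Optional[str]:
    """Return the path of the first name that resolves, else None."""
    for name in names:
        path = which(name)
        if path:
            return path
    return None


def find_chromium(which: Which = shutil.which) -> Tuple[Optional[str], Optional[str]]:
    """Return (browser_path, driver_path) for the installed binaries."""
    return find_first(BROWSER_NAMES, which), find_first(DRIVER_NAMES, which)


def make_chromium_driver():
    """Start the system Chromium with the system chromedriver (Selenium 4)."""
    from selenium import webdriver
    from selenium.webdriver.chrome.options import Options
    from selenium.webdriver.chrome.service import Service

    browser, driver = find_chromium()
    if not browser or not driver:
        raise RuntimeError("chromium or chromedriver not found on PATH")
    options = Options()
    options.binary_location = browser
    for arg in ("--no-sandbox", "--disable-dev-shm-usage", "--headless"):
        options.add_argument(arg)
    return webdriver.Chrome(service=Service(driver), options=options)
```

Looking the binaries up on PATH rather than hard-coding paths keeps the same code usable in both the Ubuntu and the Debian-based image.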