I downloaded the Apache Spark Docker image from here.
I then found out that SPARK_NO_DAEMONIZE should be set to true, which I did by 'bash'ing into the container and exporting the variable:

docker run -ti --name spark apache/spark:v3.3.0 bash
export SPARK_NO_DAEMONIZE=true
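For reference, the same variable can presumably also be set when the container is started, via docker's -e flag, instead of exporting it inside the shell; an equivalent sketch:

docker run -ti --name spark -e SPARK_NO_DAEMONIZE=true apache/spark:v3.3.0 bash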
I then tried starting Spark with '/opt/spark/sbin/start-master.sh' and got this error:
sh-5.1$ /opt/spark/sbin/start-master.sh
mkdir: cannot create directory ‘/opt/spark/logs’: Permission denied
chown: cannot access '/opt/spark/logs': No such file or directory
starting org.apache.spark.deploy.master.Master, logging to /opt/spark/logs/spark--org.apache.spark.deploy.master.Master-1-aaea1d8bfe7c.out
Spark Command: /usr/local/openjdk-11/bin/java -cp /opt/spark/conf:/opt/spark/jars/* -Xmx1g org.apache.spark.deploy.master.Master --host aaea1d8bfe7c --port 7077 --webui-port 8080
I understand from the 'Dockerfile' that user '185' runs everything in there. Unfortunately, I don't yet understand how to become root inside the container so that I can change the permissions or create the log directory.
Could someone please suggest what I am missing?
P.S. I don't wish to run Spark using docker-compose.yml; I want to run a single cluster.
CodePudding user response:
You can override the user that runs inside the container:
docker run -ti --user 0 --name spark apache/spark:v3.3.0 bash
You are then root inside the container.
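From that root shell you can then fix the permissions, e.g. create the log directory and hand it to the image's default user. A minimal sketch (uid 185 is taken from the question; adjust if your image differs):

mkdir -p /opt/spark/logs       # create the directory the start script could not
chown -R 185 /opt/spark/logs   # give it to the image's default Spark user

After that, /opt/spark/sbin/start-master.sh should be able to write its logs when run as user 185.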
If you want to do it through a Dockerfile instead, these are the steps:
- Create a Dockerfile as:
FROM apache/spark:v3.3.0
USER root
RUN mkdir -p /opt/spark/logs && chmod a+wr /opt/spark/logs
USER 185
ENV SPARK_NO_DAEMONIZE=true
CMD ["/opt/spark/sbin/start-master.sh"]
- Build the image
docker build -t "testspark" .
- Run the container
docker run -ti --rm testspark
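To reach the master from the host, also publish the ports it listens on (7077 for the cluster and 8080 for the web UI, as seen in the log output above), for example:

docker run -ti --rm -p 8080:8080 -p 7077:7077 testspark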