Permission error when PySpark save mode 'overwrite' in Docker container

Time:09-30

I am saving CSV files as a stream with PySpark. When I save the files I use output mode 'overwrite', and there is no problem. But when I containerize my Spark app, it throws an error. The code and the error are below:

df.write.format("csv").mode("overwrite").save("/app/files")

java.io.IOException: Unable to clear output directory file:/app/files prior to writing to it

I think the error is due to permissions, so I tried USER root in the Dockerfile, but that did not fix the error.
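One possible workaround (a sketch, not a confirmed fix) is to clear the output directory from Python before calling the Spark write, so Spark's 'overwrite' mode never has to delete it itself. This assumes the container user at least has delete rights on /app/files; the helper name is hypothetical:

```python
import os
import shutil


def clear_output_dir(path: str) -> None:
    """Remove and recreate the output directory so that Spark's
    'overwrite' mode does not need to clear it (hypothetical workaround)."""
    if os.path.isdir(path):
        shutil.rmtree(path)
    os.makedirs(path, exist_ok=True)


# Hypothetical usage before the Spark write:
# clear_output_dir("/app/files")
# df.write.format("csv").mode("overwrite").save("/app/files")
```

If even this raises a permission error, the problem is the filesystem or volume mount rather than Spark itself.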

CodePudding user response:

How about specifying the local filesystem scheme explicitly:

df.write.format("csv").mode("overwrite").save("file:///app/files")

CodePudding user response:

What is the running mode of the Spark program: local, standalone, YARN, or something else? Also check whether the Spark process inside Docker is actually running as the root user, for example:

ps -ef | grep spark_app_name
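The same check can be done from inside the PySpark job itself. A minimal sketch using only the Python standard library (the function name is an illustration, not part of the original post):

```python
import getpass
import os


def current_user() -> str:
    """Return the name of the user this process runs as, so you can
    confirm whether the containerized Spark job is really root."""
    try:
        return getpass.getuser()
    except Exception:
        # Minimal container images may lack a passwd entry for the UID;
        # fall back to the numeric UID in that case.
        return str(os.getuid())


print(current_user())
```

If this prints something other than root despite USER root in the Dockerfile, the entrypoint or the Spark launcher is dropping privileges before the job runs.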