Is it OK to force the hosting of the applicationMaster on one same node (YARN)?


I am submitting Spark applications to my 3-node Hadoop cluster. In my spark-defaults.conf file I set:

spark.yarn.appMasterEnv.SPARK_LOCAL_IP 127.0.0.1
spark.yarn.appMasterEnv.SPARK_MASTER_HOST 0.0.0.0

This way the ApplicationMaster is always hosted on the client machine (whether in client or cluster mode). Is it OK to do that?

Note that if I don't do this and YARN tries to host the ApplicationMaster on a worker node, a binding error stops the application from running.
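
For context, the kind of per-node setting that would otherwise be needed so each machine binds to its own address looks roughly like this (a sketch of conf/spark-env.sh on a single node; the IP is just a placeholder for that node's real address):

export SPARK_LOCAL_IP=192.168.1.10    # placeholder: this node's own address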

Thanks for clarifying this.

CodePudding user response:

No, it's not "OK".

One of the core design principles behind Spark is resilience. If you force one node to host the ApplicationMaster, you introduce a bottleneck and a single point of failure. Since you are using YARN, there is no reason to specify a master yourself.
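
For comparison, a typical YARN submission simply lets the ResourceManager decide where the ApplicationMaster runs; a minimal sketch, where the class name and jar are made-up placeholders:

spark-submit --master yarn --deploy-mode cluster --class com.example.MyApp my-app.jar

No SPARK_MASTER_HOST or pinned local IP is involved here.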

CodePudding user response:

If this is just for you and this works, go for it.

You aren't following the "normal" Spark strategy for a YARN cluster. Is that OK? If you have a good reason, yes, it's OK.

Would I use this in production? No.

Are there simpler more common ways of running a cluster? Yes.

You are mixing the strategies for running Spark Standalone and Spark on YARN. These are two fundamentally different architectures. If you can make the two work together, that's fun, but you may hit some weird problems, and because this is a custom set of settings you may not find a lot of support to help you.
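
To illustrate the difference, the two architectures are normally configured with different spark.master values in spark-defaults.conf (a sketch; the hostname is a placeholder):

# Spark Standalone: the master is a Spark daemon you point at explicitly
spark.master spark://standalone-master-host:7077

# Spark on YARN: the ResourceManager decides where the ApplicationMaster and executors run
spark.master yarn

SPARK_MASTER_HOST only has meaning for the standalone master daemon, which is part of why mixing it into a YARN setup gets confusing.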
