I'm very confused right now. Please check whether my understanding is right.
There are 4 cases, with commands like the ones below:
# This means YARN is in cluster mode and the deploy mode is cluster.
# The cluster has a YARN container (holding the Spark AM and the Spark driver) and the YARN NodeManager.
spark-submit --master yarn --deploy-mode cluster
# This means YARN is in cluster mode and the deploy mode is client.
# The client has the Spark driver.
# The cluster has a YARN container (holding the Spark AM and the Spark driver) and the YARN NodeManager.
spark-submit --master yarn --deploy-mode client
# This means YARN is in client mode and the deploy mode is cluster.
# The cluster has a YARN container (holding the Spark AM) and the YARN NodeManager.
spark-submit --master yarn-client --deploy-mode cluster
# This means YARN is in client mode and the deploy mode is client.
# The client has the Spark driver.
# The cluster has a YARN container (holding the Spark AM) and the YARN NodeManager.
spark-submit --master yarn-client --deploy-mode client
Is the explanation of the above code correct?
CodePudding user response:
# Use YARN and deploy the driver inside the YARN cluster.
spark-submit --master yarn --deploy-mode cluster
# Use YARN and run the driver on my local machine (the machine that launches the job).
spark-submit --master yarn --deploy-mode client  # client is the default if you don't specify --deploy-mode
These aren't actual options anymore, so they aren't really worth discussing:
spark-submit --master yarn-client --deploy-mode cluster
spark-submit --master yarn-client --deploy-mode client
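--master yarn-client
may have been an option in early versions of Spark, but it isn't used today (as referenced in the documentation above). As far as I know, in those older releases it was shorthand for --master yarn --deploy-mode client, so pairing it with --deploy-mode cluster is contradictory anyway.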
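
If it helps to keep the two supported combinations straight, here is a rough sketch in shell. The function driver_location is a made-up helper for illustration, not part of Spark or YARN; it only encodes the rule described in this answer (with --master yarn, the deploy mode decides where the driver runs, and client is the default).

```shell
#!/usr/bin/env bash
# Hypothetical helper, NOT a Spark command: given a --deploy-mode value,
# print where the Spark driver runs when --master is yarn.
driver_location() {
  # An omitted or empty argument falls back to "client", mirroring
  # spark-submit's default deploy mode.
  case "${1:-client}" in
    cluster) echo "inside the YARN ApplicationMaster container" ;;
    client)  echo "on the machine that ran spark-submit" ;;
    *)       echo "unknown deploy mode: $1" >&2; return 1 ;;
  esac
}

driver_location cluster
driver_location client
driver_location   # no argument: treated as client
```

In both cases an ApplicationMaster still runs in a YARN container on the cluster; the only thing that changes is whether the driver lives inside it or back on the submitting machine.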