My Spark job fails with the following error:
java.lang.IllegalArgumentException: Required executor memory (33792 MB), offHeap memory (0) MB, overhead (8192 MB), and PySpark memory (0 MB)
is above the max threshold (24576 MB) of this cluster!
Please check the values of 'yarn.scheduler.maximum-allocation-mb' and/or 'yarn.nodemanager.resource.memory-mb'.
I have defined executor memory to be 33g and executor memory overhead to be 8g. However, the total should be less than or equal to 24g as per the error log. Can someone help me understand what exactly the 24g refers to? Is it the RAM on the master node or something else? Why is it capped at 24g? Once I figure it out, I can programmatically calculate my other values so I don't run into this issue again.
Setup: a make target that runs multiple spark-submit commands, triggered from Jenkins, which launches the jobs on an AWS EMR cluster running Spark 3.x.
CodePudding user response:
This error happens because you are requesting more memory per executor container than YARN will grant on this cluster (the check is performed in org.apache.spark.deploy.yarn.Client). The 24576 MB (24g) threshold is not the RAM of the master node; it is the maximum memory YARN will allocate to a single container, governed by yarn.scheduler.maximum-allocation-mb and/or yarn.nodemanager.resource.memory-mb. Your request of 33792 MB executor memory + 8192 MB overhead = 41984 MB is well above it. For your case specifically (AWS EMR), check the value of yarn.nodemanager.resource.memory-mb as the message says (in yarn-site.xml or via the NodeManager Web UI), and make sure executor memory + memory overhead + off-heap memory + PySpark memory stays at or below the per-container limit.
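Since you mentioned wanting to compute the values programmatically, here is a minimal Python sketch of the same arithmetic the YARN client performs. It assumes Spark's default overhead rule (max(384 MB, 10% of executor memory)) when spark.executor.memoryOverhead is not set explicitly; the function names and the 24576 MB figure are illustrative, taken from your error message rather than read from a real cluster:

```python
# Sketch of the per-container memory budget check that YARN applies.
# Assumption: default overhead = max(384 MB, 10% of executor memory)
# when spark.executor.memoryOverhead is not set explicitly.

def container_request_mb(executor_memory_mb, overhead_mb=None,
                         offheap_mb=0, pyspark_mb=0):
    """Total memory asked of YARN for one executor container."""
    if overhead_mb is None:
        overhead_mb = max(384, int(executor_memory_mb * 0.10))
    return executor_memory_mb + overhead_mb + offheap_mb + pyspark_mb


def max_executor_memory_mb(yarn_max_allocation_mb, overhead_mb=None,
                           offheap_mb=0, pyspark_mb=0):
    """Largest --executor-memory (MB) that still fits in one container."""
    budget = yarn_max_allocation_mb - offheap_mb - pyspark_mb
    if overhead_mb is not None:
        return budget - overhead_mb
    # executor + 0.10 * executor <= budget, with a 384 MB overhead floor
    return min(budget - 384, int(budget / 1.10))


yarn_limit = 24576  # the "max threshold" reported in the error (24g)

# Your current request: 33g executor memory + 8g overhead
print(container_request_mb(33 * 1024, overhead_mb=8 * 1024))  # 41984 > 24576, rejected

# Largest executor memory that fits if you keep the 8g overhead
print(max_executor_memory_mb(yarn_limit, overhead_mb=8 * 1024))  # 16384 (16g)
```

You could feed the result into the spark-submit flags (--executor-memory / --conf spark.executor.memoryOverhead) that your make target builds, or simply lower the values by hand so their sum stays under the cluster's per-container limit.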