I have a PySpark program that runs normally when submitted in standalone mode, but under YARN it always fails with a "container is running beyond physical memory limits" error and the container gets killed (the resource allocation is identical in both modes). Investigating, I found that while the PySpark job runs, the executor's JVM process by default occupies the full executor-memory, so the additional Python worker processes push the container past its physical memory limit (while the Python processes are actually doing the work, the JVM process is mostly waiting, which is why standalone mode does not hit the out-of-memory error).
Could anyone advise: in Spark-on-YARN mode, is there a one-to-one correspondence between the executor-memory value and the container memory size? How can this be solved effectively? (I have already tried adjusting executor-memory, but the problem persists. Which cluster settings need to be changed?)
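For reference, a minimal sketch of the configuration knobs that seem relevant here, assuming Spark on YARN (the app name and values are placeholders, not a tested fix): YARN sizes each container as roughly spark.executor.memory plus a memory overhead allowance, and the Python workers are expected to fit inside that overhead portion, so raising the overhead rather than executor-memory is the usual lever.

```python
from pyspark import SparkConf, SparkContext

conf = (
    SparkConf()
    .setAppName("pyspark-on-yarn-memory-example")  # hypothetical app name
    # JVM heap per executor; YARN adds an overhead allowance on top of this
    # to arrive at the container's physical memory limit.
    .set("spark.executor.memory", "4g")
    # Room outside the JVM heap for off-heap use and the Python workers.
    # On Spark 2.2 and earlier the key is spark.yarn.executor.memoryOverhead
    # (value in MB); on Spark 2.3+ it is spark.executor.memoryOverhead.
    .set("spark.yarn.executor.memoryOverhead", "2048")
    # Per-Python-worker memory before aggregations spill to disk; lowering it
    # trades speed for a smaller Python-side footprint.
    .set("spark.python.worker.memory", "512m")
)

sc = SparkContext(conf=conf)
```

The same settings can also be passed on the command line via `spark-submit --conf key=value`.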