Spark Streaming for Kafka doubts (on YARN)

Time: 09-23

A common architecture we use for Kafka streaming computation is Kafka -> Spark Streaming (on YARN) -> DB.
When we submit the Spark job, the configuration (based on PySpark) generally looks something like this:
spark-submit --class org.apache.spark.examples.XXXX \
  --master yarn \
  --num-executors 4 \
  --driver-memory 2g \
  --executor-memory 3g \
  --executor-cores 4 \
  ...
My question is: when the job starts consuming a topic, does the driver receive the data and then send it to the four executors to perform the map & reduce? Or do the four executors concurrently fetch data from different Kafka partitions, each then running map & reduce on its own portion?
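For reference: with Spark's direct (receiver-less) Kafka integration (`KafkaUtils.createDirectStream`), the driver only computes per-partition offset ranges for each micro-batch; executor tasks then consume their assigned Kafka partitions in parallel, one Spark partition per Kafka partition. A toy pure-Python sketch of that mapping (not the real Spark API; names, topic, and offsets here are illustrative only):

```python
from dataclasses import dataclass
from multiprocessing.dummy import Pool  # thread pool standing in for executors


@dataclass
class OffsetRange:
    # One Kafka partition's slice of data for a single micro-batch.
    topic: str
    partition: int
    from_offset: int
    until_offset: int


# "Driver" side: compute one offset range per Kafka partition.
# No record data moves through the driver here.
offset_ranges = [OffsetRange("clicks", p, 0, 100) for p in range(4)]


def fetch(r):
    # Pretend broker: return the records covered by one partition's range.
    return [f"{r.topic}-p{r.partition}-msg{o}"
            for o in range(r.from_offset, r.until_offset)]


def task(r):
    # "Executor" side: each task fetches and map/reduces its own partition.
    records = fetch(r)                  # the consumer runs inside the task
    return (r.partition, len(records))  # toy "reduce": count per partition


with Pool(4) as pool:                   # 4 workers ~ 4 executors
    counts = dict(pool.map(task, offset_ranges))

print(counts)  # each partition was fetched and processed independently
```

The point of the sketch is the division of labor: the driver plans (offset ranges), while the per-partition fetch and the map/reduce both happen inside the executor tasks, in parallel.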

Thank you.