Spark RDDs cannot be nested: asking the experts for help

Time:09-17

1. In Spark, because of closures, an RDD cannot reference another RDD inside a transformation. I have read many articles explaining this, but they are all vague and I still do not understand it well.
2. Inside an RDD transformation you cannot call the SparkContext object. Is that because SparkContext can only be used on the driver side, while the closures of transformations run on the executor side?

CodePudding user response:

Your understanding is right. Nesting has to be replaced with a join. If you really must nest, the only way is to collect the small RDD, broadcast it, and access the broadcast variable inside the transformation operator.
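
A minimal sketch of both workarounds in PySpark, assuming a local Spark installation. The data and the names `orders`, `users`, and `attach_city` are made up for illustration and are not from the question. Calling `main()` runs the Spark part.

```python
def attach_city(order, city_by_user):
    """Pair one (user_id, amount) record with its city from a plain dict."""
    user_id, amount = order
    return (user_id, (amount, city_by_user.get(user_id)))


def main():
    # Imported here so attach_city stays usable without a Spark installation.
    from pyspark import SparkContext

    sc = SparkContext("local[2]", "nested-rdd-workaround")
    try:
        # Hypothetical data: `orders` stands in for the large RDD,
        # `users` for the small one.
        orders = sc.parallelize([(1, 30.0), (2, 12.5), (1, 7.0)])
        users = sc.parallelize([(1, "Paris"), (2, "Oslo")])

        # Option 1: join on the key instead of touching `users`
        # inside a closure. Each result is (user_id, (amount, city)).
        joined = orders.join(users)
        print(sorted(joined.collect()))

        # Option 2: collect the small RDD on the driver, broadcast it,
        # and read the broadcast value inside the transformation.
        city_by_user = sc.broadcast(users.collectAsMap())
        enriched = orders.map(lambda o: attach_city(o, city_by_user.value))
        print(sorted(enriched.collect()))
    finally:
        sc.stop()
```

Only plain Python objects (the broadcast dict) are captured by the closure in option 2, never an RDD or the SparkContext. Note that `join` is an inner join, so orders with no matching user are dropped, while the broadcast lookup keeps them with `None` as the city.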

CodePudding user response:

Why can't an RDD reference another RDD inside a transformation? Is there a good explanation?