Spark Dataframe show not generating a DAG

Time:03-21

I am trying to create a DataFrame using the toDF function, like this: [image]

When I look at the Spark UI after running the df.show action, I don't see any DAG. Why is this happening?

CodePudding user response:

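Because the data is a local, in-memory Seq and nothing has been parallelized. When a DataFrame is created from a Seq, Spark has an optimization that lets it return the rows for show immediately on the driver, so no job (and therefore no DAG) is launched.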

The same DataFrame created via parallelize:

val df = sc.parallelize(1 until 5).toDF("a")

does produce a job and a DAG, because the data is distributed across the workers.
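To see the difference yourself, here is a minimal sketch that builds the DataFrame both ways and prints the physical plan for each. The object name, app name, and local master are assumptions for illustration, not part of the original question. On the Spark versions I have seen, the Seq-based plan shows a LocalTableScan while the parallelize-based plan scans an RDD, but the exact plan text can vary by version.

```scala
import org.apache.spark.sql.SparkSession

// Hypothetical demo object; names and settings are illustrative only.
object ShowDagDemo {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("show-dag-demo")   // assumed app name
      .master("local[*]")         // assumed local mode for experimenting
      .getOrCreate()
    import spark.implicits._      // brings toDF into scope

    // Case 1: DataFrame built from a local Seq held on the driver.
    val localDf = Seq(1, 2, 3, 4).toDF("a")
    localDf.explain()             // print the physical plan
    localDf.show()                // then check the Jobs tab in the Spark UI

    // Case 2: DataFrame built from a distributed RDD.
    val rddDf = spark.sparkContext.parallelize(1 until 5).toDF("a")
    rddDf.explain()
    rddDf.show()                  // this action should appear as a job in the UI

    spark.stop()
  }
}
```

Comparing the Jobs tab after each show call should make it clear which of the two actions actually scheduled work on the executors.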
