I have to transform the code below into Spark, but I don't understand what exactly Seq does in this code.
val tempFactDF = unionTempDF.join(fact.select("x","y","d","f","s"),
Seq("x","y","d","f")).dropDuplicates
CodePudding user response:
Here Seq("x","y","d","f") is the list of column names to join on: the join matches rows where all four columns have equal values in both DataFrames. Since no join type is given, it is an inner join.
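It is roughly equivalent to the following, except that the Seq form keeps a single copy of each join column in the result, while the explicit condition keeps the copies from both sides: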
val joiningTable = fact.select("x","y","d","f","s")
unionTempDF.join(joiningTable, unionTempDF("x") === joiningTable("x") &&
unionTempDF("y") === joiningTable("y") &&
unionTempDF("d") === joiningTable("d") &&
unionTempDF("f") === joiningTable("f"))
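
To see the difference between the two forms, here is a minimal sketch that runs both joins on made-up data. The sample rows, the extra column "other", the object name SeqJoinSketch and the local SparkSession are assumptions for illustration only; the real unionTempDF and fact are not shown in the question. It assumes Spark SQL is on the classpath.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}

object SeqJoinSketch {
  // Join on a list of column names: each key column appears once in the result.
  def joinByName(left: DataFrame, right: DataFrame): DataFrame =
    left.join(right, Seq("x", "y", "d", "f"))

  // Join on an explicit condition: the key columns of both sides are kept.
  def joinByCondition(left: DataFrame, right: DataFrame): DataFrame =
    left.join(right,
      left("x") === right("x") &&
      left("y") === right("y") &&
      left("d") === right("d") &&
      left("f") === right("f"))

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("seq-join-sketch")
      .getOrCreate()
    import spark.implicits._

    // Hypothetical sample rows standing in for the real DataFrames.
    val unionTempDF = Seq((1, 2, 3, 4, "a")).toDF("x", "y", "d", "f", "other")
    val fact = Seq((1, 2, 3, 4, "s1"), (1, 2, 3, 4, "s1")).toDF("x", "y", "d", "f", "s")
    val joiningTable = fact.select("x", "y", "d", "f", "s")

    // Same shape as the code in the question, including dropDuplicates.
    val byName = joinByName(unionTempDF, joiningTable).dropDuplicates()
    println(byName.columns.length)      // 6: x, y, d, f once each, plus other and s

    val byCondition = joinByCondition(unionTempDF, joiningTable)
    println(byCondition.columns.length) // 10: both sides' key columns are present

    spark.stop()
  }
}
```

With the condition form, a later select("x") would be ambiguous because both sides contribute a column named x, which is one reason the Seq form is usually the more convenient choice when the key columns have the same names on both sides.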