I would like to merge two Spark DataFrames (Scala). The first DataFrame contains only one row. The second DataFrame has multiple rows. I would like to merge them and copy the address / phone column values from the first DataFrame to every row of the second DataFrame. Is there a way to do it using Spark operations?
DF1
name    age    address    phone
ABC     25     XYZ        00000

DF2
name    age
Bill    30
Steve   40
Jackie  50

Final DF
name    age    address    phone
ABC     25     XYZ        00000
Bill    30     XYZ        00000
Steve   40     XYZ        00000
Jackie  50     XYZ        00000
CodePudding user response:
There is a simple way to do it: collect the single row from df1, add its address and phone values to df2 as literal columns, and then union the two DataFrames.

import org.apache.spark.sql.functions.lit

// df1 has exactly one row, so collect it to the driver
val row = df1.select("address", "phone").collect()(0)

// Add the constant address/phone columns to df2, giving it the same
// schema and column order as df1, then union with df1's row first
// so the result matches the expected output
val finalDF = df1.union(
  df2.withColumn("address", lit(row(0)))
     .withColumn("phone", lit(row(1)))
)
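Since df1 has exactly one row, an alternative sketch that stays entirely inside Spark (no collect to the driver) is a cross join: every row of df2 is paired with the single address/phone row. The variable names below assume the same df1 and df2 as above:

import org.apache.spark.sql.functions.broadcast

// Pair each df2 row with the one-row address/phone projection of df1.
// broadcast() hints Spark to ship the tiny side to every executor,
// so the cross join stays cheap.
val withContact = df2.crossJoin(broadcast(df1.select("address", "phone")))

// Prepend df1's own row to match the expected output
val finalDF = df1.union(withContact)

Both versions produce the same result; the crossJoin form is preferable if df1 might not yet be computed, since it avoids materializing the row on the driver.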