PySpark: add a row from one dataframe as a column with constant value to another dataframe

I have two Spark dataframes:

>df1
+---------------+
|         values|
+---------------+
|[a, b, c, d, ..|
+---------------+

>df2
+---+---------+
| id|   number|
+---+---------+
|  1|    34523|
|  2|    56438|
|  5|    90342|
+---+---------+

How can I add column values from df1 as constant value to each row in df2?
Expected output:

+---+---------+---------------+
| id|   number|         values|
+---+---------+---------------+
|  1|    34523|[a, b, c, d, ..|
|  2|    56438|[a, b, c, d, ..|
|  5|    90342|[a, b, c, d, ..|
+---+---------+---------------+

CodePudding user response:

It depends: if df1 has only one row, you may as well just cross join. Keep in mind that a cross join can become quite expensive if multiple rows are involved.

df2.crossJoin(df1.select("values")).show()