How do I implement the following logic with DataFrames?

Time:09-23

I have two DataFrames, df1 and df2, each with two columns.

Could you tell me how to implement the following logic:

for (...)  // iterate over df1
{
    for (...)  // iterate over df2
    {
        fun(df1.1, df1.2, df2.1, df2.2)  // fun takes the two columns of each df as its parameters
    }
}

CodePudding user response:

To be clear: you can't.
A DataFrame is backed by an RDD, and RDDs are distributed. Operations on one RDD cannot be nested inside operations on another, so this approach will not work.
What are you actually trying to achieve? Does it have to be implemented as a nested loop? Can't the API that DataFrame provides do what you need?

CodePudding user response:

Is what you want a join? Take the Cartesian product of df1 and df2 directly, then run your fun function on each row of the result and filter the rows as needed.
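To make the semantics concrete, here is a minimal plain-Java sketch of what the Cartesian-product approach computes, with in-memory lists standing in for the two DataFrames. Row2, Fun4, and CrossApply are hypothetical names made up for this illustration, not part of the Spark API, and the summing function is just a placeholder for fun. In Spark the pairing is done by the join itself rather than by nested loops in driver code.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in for one two-column row of df1 or df2.
final class Row2 {
    final int c1;
    final int c2;

    Row2(int c1, int c2) {
        this.c1 = c1;
        this.c2 = c2;
    }
}

// Hypothetical four-argument function, playing the role of "fun" in the question.
interface Fun4 {
    int apply(int a1, int a2, int b1, int b2);
}

final class CrossApply {
    // Pairs every row of left with every row of right (the Cartesian product)
    // and applies fun to the four column values of each pair.
    static List<Integer> crossApply(List<Row2> left, List<Row2> right, Fun4 fun) {
        List<Integer> out = new ArrayList<>(left.size() * right.size());
        for (Row2 a : left) {
            for (Row2 b : right) {
                out.add(fun.apply(a.c1, a.c2, b.c1, b.c2));
            }
        }
        return out;
    }

    public static void main(String[] args) {
        List<Row2> df1 = Arrays.asList(new Row2(1, 2), new Row2(3, 4));
        List<Row2> df2 = Arrays.asList(new Row2(10, 20), new Row2(30, 40));
        // Placeholder fun: the sum of all four column values.
        List<Integer> res = crossApply(df1, df2, (a1, a2, b1, b2) -> a1 + a2 + b1 + b2);
        System.out.println(res); // prints [33, 73, 37, 77]
    }
}
```

The result has one entry per (df1 row, df2 row) pair, so its size is the product of the two row counts. That growth is worth keeping in mind before running a full Cartesian product on large tables.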

CodePudding user response:

Quoting link0007's reply on the second floor:
Is what you want a join? Take the Cartesian product of df1 and df2 directly, then run your fun function on each row of the result and filter the rows as needed.


 
df1.registerTempTable("df1");
df2.registerTempTable("df2");

// "udf" stands for your fun function, which must already be registered with sqlContext.
// Joining on 1=1 produces the full Cartesian product.
DataFrame res = sqlContext.sql("select udf(df1.1, df1.2, df2.1, df2.2) as udf_result from df1 join df2 on 1=1");
res.show();