Could you tell me how to implement the following logic?
for (...)   // traverse df1
{
    for (...)   // traverse df2
    {
        fun(df1.1, df1.2, df2.1, df2.2)   // fun takes two columns from each of the two DataFrames
    }
}
CodePudding user response:
To be clear: you can't. A DataFrame is backed by an RDD, RDDs are distributed, and operations on them cannot be nested inside one another, so this approach will not work.
What are you actually trying to achieve? Does it have to be implemented as a nested loop? Is there something the DataFrame API can't do for you?
CodePudding user response:
Is what you want a join? Take the Cartesian product of df1 and df2 directly, run your fun function on each resulting row, and then filter for the rows you want.
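To make the Cartesian-product suggestion concrete, here is a minimal sketch in plain Python rather than Spark, so it runs without a cluster. The row data, the body of fun, and the filter threshold are all made up for illustration; only the shape of the approach (pair every row with every row, apply fun once per pair, then filter) comes from the reply above. In Spark the pairing step would presumably be `df1.crossJoin(df2)`, with fun applied as a column expression or UDF on the joined rows.

```python
from itertools import product

# Hypothetical stand-ins for df1 and df2: each row is a (col1, col2) tuple.
df1_rows = [(1, 2), (3, 4)]
df2_rows = [(10, 20), (30, 40)]


def fun(a1, a2, b1, b2):
    """Placeholder for the asker's fun; here it just sums its four arguments."""
    return a1 + a2 + b1 + b2


# Cartesian product: every df1 row paired with every df2 row,
# i.e. exactly the combinations the nested loops would have visited.
pairs = list(product(df1_rows, df2_rows))

# Apply fun once per pair and keep the result next to the two source rows.
results = [(r1, r2, fun(*r1, *r2)) for r1, r2 in pairs]

# Filter for the rows of interest (the threshold 40 is an arbitrary example).
kept = [row for row in results if row[2] > 40]

for r1, r2, value in kept:
    print(r1, r2, value)
```

Keep in mind that a Cartesian product has len(df1) * len(df2) rows, so on large DataFrames it is worth checking whether fun can be expressed as a join condition instead.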