I want to compare the values in one Spark DataFrame with the values in another DataFrame, then filter

Time:09-18

I want to compare the values in one Spark DataFrame with the values in another DataFrame (greater than or less than), then filter out the matching rows as the result. Could you tell me which function I should use? I hope an expert can answer; it feels like it should be very simple.

CodePudding user response:

Do you mean comparing the n-th row of df1 with the n-th row of df2? Do df1 and df2 contain the same number of rows?

CodePudding user response:

If I understand correctly, it can be done like this, but you need to make sure that the two DataFrames have the same number of partitions and the same number of rows in each partition:
df1.rdd.zip(df2.rdd).filter { case (row1, row2) =>
  val data1 = row1.getInt(0) // first column of the df1 row
  val data2 = row2.getInt(0) // first column of the df2 row
  data1 > data2              // keep only pairs where the df1 value is larger
}
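For anyone who wants to try the zip approach end to end, here is a minimal sketch. The sample data, the column name "value", the object name CompareRows and the local[1] master are all assumptions made for illustration; they are not from the original question. With a single local core, both small DataFrames should end up with one partition each, which satisfies the requirement that zip needs matching partitions and row counts.

```scala
import org.apache.spark.sql.SparkSession

object CompareRows {
  def main(args: Array[String]): Unit = {
    // Local single-core session, for illustration only (assumption)
    val spark = SparkSession.builder()
      .master("local[1]")
      .appName("compare-rows")
      .getOrCreate()
    import spark.implicits._

    // Hypothetical sample data: one Int column, same row count in both
    val df1 = Seq(5, 2, 9).toDF("value")
    val df2 = Seq(3, 4, 1).toDF("value")

    // Pair rows by position, then keep pairs where the df1 value is larger.
    // zip fails at runtime if partition counts or per-partition sizes differ.
    val kept = df1.rdd.zip(df2.rdd).filter { case (row1, row2) =>
      row1.getInt(0) > row2.getInt(0)
    }

    // Print the surviving pairs on the driver
    kept.collect().foreach { case (row1, row2) =>
      println(s"${row1.getInt(0)} > ${row2.getInt(0)}")
    }

    spark.stop()
  }
}
```

If the two DataFrames come from different sources, their partitioning may not line up and zip will throw an exception. In that case, pairing rows through an explicit index (for example with zipWithIndex followed by a join on the index) is probably the safer route.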