a, b, c
d, e, f
g, h, i
Now I want to add a sequence-number column at the front:
1, a, b, c
2, d, e, f
3, g, h, i
Either DataFrame or RDD form is fine.
Could any of the experts here tell me how to achieve this?
CodePudding user response:
There are two ways. The first is a global ID generator (for example, a database Sequence, or a number-issuing service that keeps producing increasing values), but it is inefficient.
The second is mapPartitions: get the index of the current partition, then compute partition index x coefficient + a locally increasing value within the partition, where the coefficient is the record count of the largest partition plus some redundancy.
The former is the most convenient option; the latter is the fastest but easy to get wrong.
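The second approach can be sketched roughly as follows. This is a plain-Python simulation, not Spark code: each inner list stands in for one partition, and `number_partition`, `SLACK`, and `stride` are names made up for the illustration. In Spark, a function like this would presumably be passed to `RDD.mapPartitionsWithIndex`, and the largest partition size would have to be computed or estimated beforehand.

```python
# Plain-Python simulation of the partition-offset scheme.
# Each inner list stands in for one Spark partition (assumed layout).
partitions = [
    [("a", "b", "c"), ("d", "e", "f")],  # partition 0
    [("g", "h", "i")],                   # partition 1
]


def number_partition(index, rows, stride):
    """Prefix each row with index * stride + a 1-based local counter."""
    return [(index * stride + local,) + row
            for local, row in enumerate(rows, start=1)]


# Coefficient = size of the largest partition + some redundancy.
# SLACK is an arbitrary value chosen for this sketch.
SLACK = 10
stride = max(len(p) for p in partitions) + SLACK

numbered = [number_partition(i, rows, stride)
            for i, rows in enumerate(partitions)]

for part in numbered:
    for row in part:
        print(row)
```

Note that the IDs produced this way are unique and increasing but not consecutive: there is a gap at every partition boundary. If strictly consecutive numbers like 1, 2, 3 are required, Spark's `RDD.zipWithIndex` may be worth a look, at the cost of an extra pass to count the partitions.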
CodePudding user response:
To add: the most convenient option, though it easily blows up memory, is to repartition to a single partition. With only one partition, the local increment is global, but a large amount of data will cause an OOM.
CodePudding user response:
I also need to solve a similar problem. May I ask whether the original poster has solved it?