For example:
A
B, C
D, E, F, G
The main server outputs A
Server a1 outputs D, B, E
Server a2 outputs C, F, G
Could you tell me how to do this? Can it be done with Hadoop? Please explain in detail, thank you.
CodePudding user response:
Hadoop processes data in parallel by splitting the input, then shuffling, merging, and combining the intermediate results. However, your program does not control how tasks are distributed across the nodes, and the output is collected centrally, because the cluster nodes are hidden from the MR program.
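
To make the phases concrete, here is a minimal sketch in plain Python (it is not Hadoop code and does not use any Hadoop API). It only imitates the split, map, shuffle, and reduce steps in a single process. The `GROUP_OF` table and all function names are hypothetical and are based on the grouping in the question. Note that nothing in the sketch picks a server: the items are grouped by key, and where each step runs would be up to the framework.

```python
from collections import defaultdict

# Hypothetical grouping taken from the question: the output group of each item.
GROUP_OF = {
    "A": "main",
    "D": "a1", "B": "a1", "E": "a1",
    "C": "a2", "F": "a2", "G": "a2",
}


def split(records, n_splits):
    """Cut the input into chunks, similar in spirit to input splits."""
    return [records[i::n_splits] for i in range(n_splits)]


def map_phase(chunk):
    """Emit (group, item) pairs; the mapper does not know where it runs."""
    return [(GROUP_OF[item], item) for item in chunk]


def shuffle(mapped_chunks):
    """Group the values by key across the output of every mapper."""
    grouped = defaultdict(list)
    for pairs in mapped_chunks:
        for key, value in pairs:
            grouped[key].append(value)
    return grouped


def reduce_phase(grouped):
    """Run one reduce step per key and collect everything into one result."""
    return {key: sorted(values) for key, values in grouped.items()}


def run_job(records, n_splits=3):
    """Chain the phases: split -> map -> shuffle -> reduce."""
    chunks = split(records, n_splits)
    mapped = [map_phase(chunk) for chunk in chunks]
    return reduce_phase(shuffle(mapped))


if __name__ == "__main__":
    result = run_job(["A", "B", "C", "D", "E", "F", "G"])
    for group in ("main", "a1", "a2"):
        print(group, result[group])
```

The items inside each group come back sorted, not in the order written in the question (D, B, E), because the reduce step in this sketch sorts them. In a real job you can influence which reduce partition a key goes to, but not which physical server handles that partition.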