My approach is: rdd.collect() (convert the RDD to an Array, and then operate on that).
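A minimal sketch of that approach, assuming a PySpark RDD of delimited text lines. The function name `column_via_collect`, the `index` parameter, and the comma separator are hypothetical, chosen only for illustration. Note that collect() brings every element back to the driver, so this is only reasonable when the data fits in driver memory.

```python
def column_via_collect(rdd, index, sep=","):
    """Pull the whole RDD to the driver, then pick one column locally.

    `rdd` is assumed to hold text lines; `index` and `sep` are
    illustrative parameters, not part of any Spark API.
    """
    rows = rdd.collect()  # returns a plain Python list on the driver
    # From here on this is ordinary local code, not distributed work.
    return [row.split(sep)[index] for row in rows]
```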
CodePudding user response:
Use a map operation on the RDD: call split on every line, and then take the specified column.
CodePudding user response:
Try Spark SQL: import the structured data file as a DataFrame, then operate on the file data the same way you would on a database table, including filter, group, and agg.
CodePudding user response:
rdd.flatMap()
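The suggestions in this thread could be sketched roughly as follows, assuming PySpark, a comma-delimited text file, and a header row for the DataFrame case. All function names and the `index`, `key`, and `value` parameters are hypothetical. The difference between the first two: map keeps one output element per line (the chosen column), while flatMap flattens every field of every line into a single RDD, so it does not by itself select a column.

```python
def split_line(line, sep=","):
    """Split one text line into its fields."""
    return line.split(sep)

def take_column(rdd, index, sep=","):
    # map: one output element per input line (the field at `index`).
    return rdd.map(lambda line: split_line(line, sep)[index])

def all_fields(rdd, sep=","):
    # flatMap: each line yields several fields, flattened into one RDD.
    return rdd.flatMap(lambda line: split_line(line, sep))

def summarize_with_dataframe(spark, path, key, value):
    # Spark SQL route: `spark` is an existing SparkSession; `key` and
    # `value` are assumed column names taken from the file's header row.
    df = spark.read.csv(path, header=True, inferSchema=True)
    return df.filter(df[value].isNotNull()).groupBy(key).agg({value: "sum"})
```

With an existing SparkContext `sc`, something like `take_column(sc.textFile(path), 2).collect()` would return the third column of each line as a list.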