Then I run it, and the output is always 1. I have been stuck on this for a long time and it is driving me crazy. I hope an expert can help! Here is my HDFS file:
Judging from the last figure, the file contains a lot of words, so how can the result be 1?
CodePudding user response:
Hello, what you are counting is not the number of words but the number of elements in the RDD. You need to do this:
val words = readmeFile.flatMap(_.split(" "))
val wordCounts = words.map(x => (x, 1)).reduceByKey(_ + _)
wordCounts.collect().foreach(println)
This counts how many times each word occurs.
CodePudding user response:
As the figure shows, you are counting the number of lines, not the number of words.
CodePudding user response:
Your statement only reads the file, and the file contains just one line. By default, textFile splits its input on newlines, so each line becomes one element and the count comes out as 1. To count words, split each line first:
val words = readmeFile.flatMap(_.split(" "))
val wordCounts = words.map(x => (x, 1)).reduceByKey(_ + _)
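To make the difference between counting lines and counting words concrete, here is a minimal sketch that uses plain Scala collections instead of Spark, so it runs without a cluster. The object name LineVsWordCount and the sample sentence are made up for illustration, and groupBy followed by a sum stands in for Spark's reduceByKey; the real RDD code is the two lines above.

```scala
object LineVsWordCount {
  // Stand-in for the RDD returned by sc.textFile: one element per line.
  // The sample text is hypothetical; the original HDFS file is not shown.
  val lines: Seq[String] = Seq("to be or not to be")

  // Mirrors readmeFile.flatMap(_.split(" ")): one element per word.
  val words: Seq[String] = lines.flatMap(_.split(" "))

  // Mirrors words.map(x => (x, 1)).reduceByKey(_ + _):
  // pair each word with 1, group the pairs by word, then sum the ones.
  val wordCounts: Map[String, Int] =
    words
      .map(x => (x, 1))
      .groupBy(_._1)
      .map { case (word, pairs) => (word, pairs.map(_._2).sum) }

  def main(args: Array[String]): Unit = {
    println(lines.size) // 1: a single line, which is why count() returned 1
    println(words.size) // 6: the number of words after splitting
    wordCounts.toSeq.sortBy(_._1).foreach(println)
  }
}
```

Counting before the flatMap gives the number of lines (1 here), while counting after it gives the number of words (6 here), which matches the behaviour described in the answers.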