Spark beginner asking for help

Time:10-02

I am running a KMeans example directly from the IDEA environment and it fails with an error. Could an expert please help me solve it? Thank you.

The contents of kmeans_data.txt are shown below:

0.0 0.0 0.0
0.1 0.1 0.1
0.2 0.2 0.2
9.0 9.0 9.0
9.1 9.1 9.1
9.2 9.2 9.2
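
Each line above holds three numbers separated by single spaces, and the program below turns each line into a vector with `split(' ')` followed by `toDouble`. That parsing step can be sketched in plain Scala without Spark. `ParseSketch` and `parseLine` are hypothetical names used only for this illustration and are not part of the original program.

```scala
// Plain Scala, no Spark dependency.
// ParseSketch and parseLine are hypothetical names for illustration only.
object ParseSketch {
  // Split one line on single spaces and convert each token to Double.
  def parseLine(line: String): Array[Double] =
    line.split(' ').map(_.toDouble)

  def main(args: Array[String]): Unit = {
    // A few lines taken from the data file shown above.
    val lines = Seq("0.0 0.0 0.0", "0.1 0.1 0.1", "9.2 9.2 9.2")
    lines.foreach(l => println(parseLine(l).mkString("[", ", ", "]")))
  }
}
```

With this way of parsing, a blank line or two consecutive spaces produces an empty token, and `toDouble` on an empty string throws a `NumberFormatException`. That may be worth checking in the real file, although nothing in the post confirms it is the cause of the error.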
Code:
import org.apache.log4j.{Level, Logger}
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.mllib.clustering.KMeans
import org.apache.spark.mllib.linalg.Vectors

object Kmeans {
  def main(args: Array[String]): Unit = {
    // Suppress unnecessary log output on the terminal
    Logger.getLogger("org.apache.spark").setLevel(Level.WARN)
    Logger.getLogger("org.eclipse.jetty.server").setLevel(Level.OFF)

    // Set up the runtime environment
    val conf = new SparkConf().setAppName("kmeans").setMaster("spark://Sparkmaster:7077")
    val sc = new SparkContext(conf)

    // Load the data set
    val data = sc.textFile("/usr/local/hadoop/upload/kmeans_data.txt", 1)
    val parsedData = data.map(s => Vectors.dense(s.split(' ').map(_.toDouble)))

    // Cluster the data into 2 classes with 20 iterations to train the model
    val numClusters = 2
    val numIterations = 20
    val model = KMeans.train(parsedData, numClusters, numIterations)

    // Print the cluster centers of the model
    println("Cluster centers:")
    for (c <- model.clusterCenters) {
      println("  " + c.toString)
    }

    // Evaluate the model using the sum of squared errors
    val cost = model.computeCost(parsedData)
    println("Within Set Sum of Squared Errors = " + cost)

    // Test single data points with the model
    println("Vectors 0.2 0.2 0.2 is belongs to clusters:" + model.predict(Vectors.dense("0.2 0.2 0.2".split(' ').map(_.toDouble))))
    println("Vectors 0.25 0.25 0.25 is belongs to clusters:" + model.predict(Vectors.dense("0.25 0.25 0.25".split(' ').map(_.toDouble))))
    println("Vectors 8 8 8 is belongs to clusters:" + model.predict(Vectors.dense("8 8 8".split(' ').map(_.toDouble))))

    // Cross evaluation 1: return only the results
    val testdata = data.map(s => Vectors.dense(s.split(' ').map(_.toDouble)))
    val result1 = model.predict(testdata)
    result1.saveAsTextFile("/usr/local/hadoop/upload/result_kmeans1")

    // Cross evaluation 2: return the data set together with the results
    val result2 = data.map {
      line =>
        val linevectore = Vectors.dense(line.split(' ').map(_.toDouble))
        val prediction = model.predict(linevectore)
        line + " " + prediction
    }.saveAsTextFile("/usr/local/hadoop/upload/result_kmeans2")

    sc.stop()
  }
}
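
To make the evaluation and prediction steps in the code above easier to check by hand, here is a minimal plain-Scala sketch of what they compute: the K-means cost is the sum of squared distances from each point to its nearest center, and a prediction is the index of the nearest center. This is not the MLlib implementation. `CostSketch`, `sqDist`, `nearest`, and `cost` are hypothetical names, and the two centers are assumed values chosen for illustration rather than output from the program.

```scala
// Plain Scala, no Spark dependency. All names here are hypothetical.
object CostSketch {
  type Point = Array[Double]

  // Squared Euclidean distance between two points of equal dimension.
  def sqDist(a: Point, b: Point): Double =
    a.zip(b).map { case (x, y) => (x - y) * (x - y) }.sum

  // Index of the closest center, i.e. a nearest-center prediction.
  def nearest(centers: Seq[Point], p: Point): Int =
    centers.indices.minBy(i => sqDist(centers(i), p))

  // Sum over all points of the squared distance to the closest center.
  def cost(centers: Seq[Point], points: Seq[Point]): Double =
    points.map(p => sqDist(centers(nearest(centers, p)), p)).sum

  def main(args: Array[String]): Unit = {
    // The six points from the data file shown in the post.
    val points: Seq[Point] =
      Seq(0.0, 0.1, 0.2, 9.0, 9.1, 9.2).map(v => Array(v, v, v))
    // Assumed centers for illustration, not results from the real run.
    val centers: Seq[Point] = Seq(Array(0.1, 0.1, 0.1), Array(9.1, 9.1, 9.1))

    println("cost = " + cost(centers, points))
    println("cluster of (8, 8, 8) = " + nearest(centers, Array(8.0, 8.0, 8.0)))
  }
}
```

In a real run, which cluster receives index 0 or 1 depends on how the centers are initialized, so the indices printed by the Spark program may be swapped relative to this sketch.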