Spark: the RowMatrix U from computeSVD is saved with its rows in the wrong order; seeking guidance

Time:09-27


I ran into a problem when computing on the cluster. When I run a singular value decomposition of a matrix with "M.computeSVD(5000, true, 1.0e-9d)", which decomposes A = U * s * V, the s vector and the V matrix are stored normally, but the left singular matrix U is returned as a RowMatrix by default. When I store this matrix (storage code: U.rows.saveAsTextFile("hdfs://s2:9000/outsvd/big_UUT1")), the result is written as several parts: part-00000, part-00001, part-00002, and so on. So I looked for a way to get the output as a single part (storage code: U.rows.repartition(1).saveAsTextFile("hdfs://s2:9000/outsvd/big_UUT1")).

But I found that the rows of the U matrix I get are not in the right order. For example, the correct order should in principle be part-00001 + part-00000 + part-00002, but the result is written in the order part-00000 + part-00001 + part-00002.

After writing the output many times, I found that the U matrix comes out differently from run to run, apparently because of how the several part files are combined. I have not been able to work out what to do, and would appreciate advice and guidance from anyone who understands this.

Part of the test code is as follows.
Text data set: matrix_A1.txt
1,0,0,1,0,0,0,0,0
1,0,1,0,0,0,0,0,0
1,1,0,0,0,0,0,0,0
0,1,1,0,1,0,0,0,0
0,1,1,2,0,0,0,0,0
0,1,0,0,1,0,0,0,0
0,1,0,0,1,0,0,0,0
0,0,1,1,0,0,0,0,0
0,1,0,0,0,0,0,0,1
0,0,0,0,0,1,1,1,0
0,0,0,0,0,0,1,1,1
0,0,0,0,0,0,0,1,1

Scala code
import java.util.{Date, Locale}
import java.text.DateFormat
import org.apache.spark.mllib.linalg.{Vector, Vectors}
import org.apache.spark.mllib.linalg.distributed.RowMatrix
import org.apache.spark.mllib.linalg._

// Start time, for timing the decomposition
val now1 = new Date
// Parse each comma-separated line into a dense vector and build the RowMatrix
val M = new RowMatrix(sc.textFile("hdfs:///usr/matrix/matrix_A1.txt").map(_.split(','))
  .map(_.map(_.toDouble))
  .map(_.toArray).map(line => Vectors.dense(line)))
// Keep the top 4 singular values and also compute U
val svd = M.computeSVD(4, true, 1.0e-9d)
// End time
val last1 = new Date
val U = svd.U
U.rows.foreach(println)
// Write U as a single part file
U.rows.repartition(1).saveAsTextFile("hdfs://1111.11.11.111:9000/outsvd/UU1")
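
One possible way around the ordering problem described above, offered as a sketch rather than a confirmed fix: repartition(1) performs a shuffle, and a shuffle does not guarantee that rows arrive in their original order, so the row index has to be carried along explicitly. The sketch below assumes a spark-shell session where sc is already defined, attaches each row's line number with zipWithIndex, uses IndexedRowMatrix instead of RowMatrix, and sorts by index before saving. The output path and the names indexedRows, indexedM, indexedSvd, and outputPath are hypothetical.

```scala
import org.apache.spark.mllib.linalg.Vectors
import org.apache.spark.mllib.linalg.distributed.{IndexedRow, IndexedRowMatrix}

// Assumption: running in spark-shell, so `sc` already exists.
// Hypothetical output location; replace with your own path.
val outputPath = "hdfs:///outsvd/UU1_sorted"

// Attach the input line number to every row so the order can be restored later.
val indexedRows = sc.textFile("hdfs:///usr/matrix/matrix_A1.txt")
  .map(_.split(',').map(_.toDouble))
  .zipWithIndex()
  .map { case (values, idx) => IndexedRow(idx, Vectors.dense(values)) }

val indexedM = new IndexedRowMatrix(indexedRows)

// Same parameters as the RowMatrix call in the question.
val indexedSvd = indexedM.computeSVD(4, computeU = true, rCond = 1.0e-9)

// U is an IndexedRowMatrix, so each row still carries its index.
// Sorting into a single partition yields one part file in input order.
indexedSvd.U.rows
  .sortBy(_.index, ascending = true, numPartitions = 1)
  .map(row => row.index + "," + row.vector.toArray.mkString(","))
  .saveAsTextFile(outputPath)
```

Each saved line starts with the row index, so the order can be checked directly in the output file. Collecting all of U into one partition is only reasonable while U fits on a single executor; for a larger matrix, keeping several part files and relying on the stored index is likely the safer choice.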

CodePudding user response:

If anyone understands this, please advise, ideally with the detailed code changes. Many thanks in advance.

CodePudding user response:

Please don't let this thread sink.
