Convert RDD of Matrix to RDD of Vector

Time:10-13

I have an RDD[Matrix[Double]] and want to convert it to an RDD[Vector], where each row of each Matrix becomes one Vector.

I've seen a related answer, Convert Matrix to RowMatrix in Apache Spark using Scala, but it covers converting a single Matrix to an RDD of Vector, whereas my case is an RDD of Matrix.

CodePudding user response:

Use flatMap with a function that converts each Matrix to a Seq[Vector]:

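// from https://stackoverflow.com/a/28172826/1206998
// Assumes the spark.mllib Matrix and Vector types.
import org.apache.spark.mllib.linalg.{DenseVector, Matrix, Vector}

def toSeqOfVector(m: Matrix): Seq[Vector] = {
  // m.toArray is column-major, so each group of numRows values is one column.
  val columns = m.toArray.grouped(m.numRows)
  val rows = columns.toSeq.transpose // Skip this if you want a column-major RDD.
  rows.map(row => new DenseVector(row.toArray))
}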

import org.apache.spark.rdd.RDD

val matrices: RDD[Matrix] = ??? // your input
val vectors: RDD[Vector] = matrices.flatMap(toSeqOfVector)

Note: I didn't test this code, but this is the principle.
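
To see what the grouping and transpose step does without needing a Spark cluster, here is a minimal sketch in plain Scala. It assumes the matrix values are stored column by column, which is what toArray returns for a spark.mllib Matrix. The helper name rowsOfColumnMajor is made up for illustration and is not part of Spark.

```scala
// Hypothetical helper (not a Spark API): split a column-major array into rows.
def rowsOfColumnMajor(values: Array[Double], numRows: Int): Seq[Seq[Double]] = {
  // Each group of numRows consecutive values is one column of the matrix.
  val columns = values.grouped(numRows).map(_.toSeq).toSeq
  // Transposing the sequence of columns yields the sequence of rows.
  columns.transpose
}

// A 2 x 3 matrix stored column by column:
//   1.0  3.0  5.0
//   2.0  4.0  6.0
val values = Array(1.0, 2.0, 3.0, 4.0, 5.0, 6.0)
val rows = rowsOfColumnMajor(values, numRows = 2)
```

Here rows holds two sequences, one per matrix row. In toSeqOfVector each of those rows is wrapped in a DenseVector, and flatMap then flattens the per-matrix sequences into a single RDD.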
