I use the map function on a DataFrame's RDD like this:
val df2=df1.select("vScaled")
val sqldf = df2.rdd
.map { row => row.getAs[DenseVector]("vScaled").values }
.map(array => Nd4j.createFromArray(Array(array)))
It works, but I want to change the values in df2: values below 0 should be replaced with 0, and values greater than 1 should be replaced with 1.
How can I change the map function so that it applies this condition?
I tried to use an if statement, but it fails.
My code:
val sqldf = df2.rdd
.map {
row =>
if (row.getAs[DenseVector]("vScaled").values > double(1) )
{row.getAs[DenseVector]("vScaled").values = double(1).toArray}
else if (row.getAs[DenseVector]("vScaled").values< double(0) )
{ row.getAs[DenseVector]("vScaled").values= double(0).toArray }
else {row.getAs[DenseVector]("vScaled").values}
}
.map(array => Nd4j.createFromArray(Array(array)))
This code is not valid.
How can I handle that?
Thanks in advance.
CodePudding user response:
Thanks to @Tim, it works by mapping over the values again and clamping each element with math.max and math.min:
val sqldf = df2.rdd
.map {
row => row.getAs[DenseVector]("vScaled").values.map(x => math.max(0.0, math.min(1.0, x)))
}
.map(array => Nd4j.createFromArray(Array(array)))
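The if version likely fails for two reasons: `values` on a DenseVector is an `Array[Double]`, so it cannot be compared to a single number with `>` or `<`, and it is declared as a `val`, so it cannot be reassigned. Clamping has to happen per element instead. Here is a minimal sketch of just that step on a plain array, without Spark or ND4J; the names `ClampSketch` and `clamp01` are hypothetical and only for illustration.

```scala
// Hypothetical helper for illustration; not part of Spark or ND4J.
object ClampSketch {
  // Clamp every element into [0.0, 1.0]:
  // values below 0 become 0, values above 1 become 1, the rest are unchanged.
  def clamp01(values: Array[Double]): Array[Double] =
    values.map(x => math.max(0.0, math.min(1.0, x)))

  def main(args: Array[String]): Unit = {
    val sample = Array(-0.5, 0.25, 1.0, 1.7)
    println(clamp01(sample).mkString(", ")) // prints "0.0, 0.25, 1.0, 1.0"
  }
}
```

Inside the RDD pipeline this is the same expression applied to `row.getAs[DenseVector]("vScaled").values`. It returns a new array rather than modifying the vector in place.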