I'm trying to rearrange the columns of a DataFrame in Spark Scala with this code:
def performTransformations(commonArgs: Map[String, Any], dataDf: Dataset[Row]): Dataset[Row] = {
// Create local var as a copy of data
var data = dataDf
// all the transformations here
val data2 = data.select(reorderedColNames: _*)
data = data2
}
reorderedColNames is an array that holds all the columns in the order I want.
But I am getting this error:
error: type mismatch;
[ERROR] found : Unit
[ERROR] required: org.apache.spark.sql.Dataset[org.apache.spark.sql.Row]
How can I fix this? Thanks.
I have tried to rearrange the columns with other methods, but wasn't able to.
CodePudding user response:
According to your comment, the problem is that your function is declared to return Dataset[Row], but it actually returns Unit, because the last statement of the function is a variable assignment (data = data2). In Scala, a function returns the value of its last expression, and an assignment evaluates to Unit.
Change your function to:
def performTransformations(commonArgs: Map[String, Any], dataDf: Dataset[Row]): Dataset[Row] = {
// Create local var as a copy of data
var data = dataDf
// all the transformations here
data.select(reorderedColNames: _*)
}
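To see why the assignment causes the type mismatch, here is a minimal plain-Scala sketch (no Spark needed; the names wrong/right and the Seq-sorting stand-in are illustrative, not from the original code):

```scala
object ReturnValueDemo {
  // Compiles, but returns Unit: the last statement is an assignment.
  // Declaring a non-Unit return type here would reproduce the
  // "found: Unit, required: ..." type mismatch from the question.
  def wrong(xs: Seq[Int]): Unit = {
    var data = xs
    data = data.sorted // an assignment evaluates to Unit
  }

  // Returns the value: the last line is the expression itself.
  def right(xs: Seq[Int]): Seq[Int] = {
    val data = xs
    data.sorted
  }

  def main(args: Array[String]): Unit = {
    assert(right(Seq(3, 1, 2)) == Seq(1, 2, 3))
    println(right(Seq(3, 1, 2)))
  }
}
```

The same rule applies to the Spark version: ending the function with data.select(reorderedColNames: _*) makes that Dataset[Row] expression the return value, so no intermediate assignment is needed.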