I can't allocate a dataframe to this variable - Scala

Time:11-05

I'm trying to reorder the columns of a DataFrame in Spark (Scala) with this code:


def performTransformations(commonArgs: Map[String, Any], dataDf: Dataset[Row]): Dataset[Row] = {  
   // Create local var as a copy of data 
   var data = dataDf  
   *all the transformations here*       
   val data2 = data.select(reorderedColNames: _*)      
   data = data2   
}

reorderedColNames is an array that holds all the columns in the order I want.

But I am getting this error:

error: type mismatch;
[ERROR]  found   : Unit
[ERROR]  required: org.apache.spark.sql.Dataset[org.apache.spark.sql.Row]

How can I fix this? Thanks

I have tried reordering the columns with other methods, but none of them worked.

CodePudding user response:

Going by your comment, the problem is that your function is declared to return Dataset[Row] but actually returns Unit. In Scala, the value of a block is the value of its last expression, and here the last statement is a variable assignment (data = data2), which has type Unit.
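
To see the rule in isolation, here is a minimal sketch in plain Scala with no Spark involved. The object name BlockValueDemo and the helper reorder are made up for illustration; sorting a List stands in for the real transformations.

```scala
object BlockValueDemo {
  // Hypothetical stand-in for the Spark function: a plain List instead of a Dataset.
  def reorder(items: List[String]): List[String] = {
    var data = items
    data = data.sorted // reassigning in the middle of the block is fine
    data               // the last expression is the value the method returns
  }

  def main(args: Array[String]): Unit = {
    var x = 1
    // An assignment is an expression of type Unit, so this compiles:
    val assigned: Unit = { x = 2 }
    println(s"x = $x, assignment value = $assigned")
    println(reorder(List("b", "a", "c")))
  }
}
```

If the last line of reorder were an assignment such as data = data.sorted, the compiler would report the same "found: Unit" mismatch as in the question.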

Change your function to:

def performTransformations(commonArgs: Map[String, Any], dataDf: Dataset[Row]): Dataset[Row] = {  
    // Create local var as a copy of data     
    var data = dataDf  
    // all the transformations here    
    data.select(reorderedColNames: _*)   
} 
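
The snippet above assumes reorderedColNames holds Column objects. If it holds plain strings instead, select(reorderedColNames: _*) will not compile, because the String overload of select takes a first name plus varargs. A minimal sketch of one way around that, mapping each name through col first (the object ReorderColumns, the helper reorder, and the sample data are hypothetical):

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.sql.functions.col

object ReorderColumns {
  // Hypothetical helper: returns df with its columns in the given order.
  // The select call is the last expression, so it is what the method returns.
  def reorder(df: DataFrame, orderedNames: Seq[String]): DataFrame =
    df.select(orderedNames.map(col): _*)

  def main(args: Array[String]): Unit = {
    // Local session for demonstration only.
    val spark = SparkSession.builder()
      .master("local[1]")
      .appName("reorder-demo")
      .getOrCreate()
    import spark.implicits._

    val df = Seq((1, "a", true)).toDF("id", "name", "flag")
    val out = reorder(df, Seq("flag", "id", "name"))
    println(out.columns.mkString(", "))

    spark.stop()
  }
}
```

Because reorder has a single-expression body, there is no trailing assignment that could turn the result into Unit.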