How to use a variable as Spark selected fields

Time:03-11

I'm new to Scala. I have a DataFrame with a lot of columns and want to select some of its fields, but right now I have to list them all every time, as below. How can I define a variable that stands for them and pass it to select?

df.select("a", "b", "c", "d", "e", "f") 

expected:

df.select(variable val) 

CodePudding user response:

You can pass a list of columns, something like this:

import org.apache.spark.sql.functions.col

val fields = List("a", "b", "c", "d").map(col)
df.select(fields: _*)

map(col) converts your list of strings into a list of Column objects. fields: _* then expands that List into individual arguments, which is what select needs because it takes a variable number of columns.
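For a fuller picture, here is a minimal, self-contained sketch of the approach above. The object name SelectFields, the helper selectFields, the sample data, and the local SparkSession settings are all assumptions made for illustration and are not part of the original answer. It also shows a second option that, as far as I know, works without col: select has an overload taking a first column name followed by a variable number of further names.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.sql.functions.col

object SelectFields {
  // Hypothetical helper: select the named columns by converting each
  // name to a Column and expanding the sequence as varargs.
  def selectFields(df: DataFrame, names: Seq[String]): DataFrame =
    df.select(names.map(col): _*)

  def main(args: Array[String]): Unit = {
    // Local session for illustration only (assumed settings).
    val spark = SparkSession.builder()
      .appName("select-fields-sketch")
      .master("local[1]")
      .getOrCreate()
    import spark.implicits._

    // Made-up one-row DataFrame using the column names from the question.
    val df = Seq((1, 2, 3, 4, 5, 6)).toDF("a", "b", "c", "d", "e", "f")

    // The reusable variable holding the wanted column names.
    val fieldNames = List("a", "b", "c", "d")

    // Option 1: names -> Columns -> varargs (same idea as the answer).
    val byColumns = selectFields(df, fieldNames)

    // Option 2: the select(String, String*) overload.
    // This requires a non-empty list, since head fails on an empty one.
    val byNames = df.select(fieldNames.head, fieldNames.tail: _*)

    println(byColumns.columns.mkString(",")) // prints a,b,c,d
    println(byNames.columns.mkString(","))   // prints a,b,c,d

    spark.stop()
  }
}
```

Keeping the names as plain strings (rather than storing Column objects directly) makes the variable easy to reuse, for example to build it from configuration or to filter df.columns before selecting.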
