def nameFilter= (name:String):String =>{
if(name.startsWith("A")) name
}
val register= udf(nameFilter)
concatDf.select(register(col("Region"))).show()
})
concatDf.select(ageFilter(col("Region"))).show()
I have a DataFrame concatDf with a column called Region, and I want to keep only the regions whose name starts with "A".
So I wrote the UDF above, but compilation fails with:
<console>:6: error: illegal start of declaration
name.startsWith("A") name
CodePudding user response:
If you say that the Region whose name starts with "A" should be present, this means that you want to use filter and not select. If you want, for example, in this table:
+---+-------+
|id |name   |
+---+-------+
|1  |Albania|
|2  |Germany|
+---+-------+
to only keep Albania (since it starts with A), you might want to change your logic.
To do this without UDFs (which is the best way), you can simply do:
df.filter(col("name").startsWith("A"))
But if you want to use a UDF, you still have to use filter, so you have to modify your function nameFilter to return a Boolean instead of a String, like:
def nameFilter = (name: String) => name.startsWith("A")
Once you do that, you can use:
val register = udf(nameFilter)
to wrap the function as a UDF, and finally:
df.filter(register(col("name")))
to actually filter the rows. Hope this is what you need, good luck!
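For reference, here is a minimal end-to-end sketch of both approaches on the sample table. The object name FilterByPrefix, the local SparkSession settings, and the names startsWithA and startsWithAUdf are placeholders chosen for illustration, not something from the question. The null check inside the predicate is also an addition: Spark hands a Scala UDF a null for a missing string, and calling startsWith on it would throw.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.{col, udf}

// Hypothetical wrapper object, only here to make the sketch runnable.
object FilterByPrefix {
  // Plain Scala predicate: it returns a Boolean, so it can drive `filter`.
  // The null guard is an assumption added for safety with missing values.
  val startsWithA: String => Boolean =
    name => name != null && name.startsWith("A")

  def main(args: Array[String]): Unit = {
    // Assumption: a throwaway local session for experimentation.
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("filter-by-prefix")
      .getOrCreate()
    import spark.implicits._

    // Sample data mirroring the table in the answer.
    val df = Seq((1, "Albania"), (2, "Germany")).toDF("id", "name")

    // Option 1: built-in Column method, no UDF involved.
    df.filter(col("name").startsWith("A")).show()

    // Option 2: wrap the predicate as a UDF and filter with it.
    val startsWithAUdf = udf(startsWithA)
    df.filter(startsWithAUdf(col("name"))).show()

    spark.stop()
  }
}
```

Both show() calls should print only the Albania row. The built-in version is usually preferable because Spark can optimize it, while a UDF is opaque to the query optimizer.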