As far as I can tell, Spark SQL does not have the keyed hashing function I need. To select specific hashed data, I have to register a custom function (UDF) like this:
sparkSession.udf.register("hashFuncWithSecret", (s: String) => myHashFunction(s, "my_very_secret_key"))
I want "my_very_secret_key" not to be exposed or visible from other Spark sessions, in the UI, or anywhere else. Is this possible? Thank you!
CodePudding user response:
It will not be visible from other Spark sessions. The furthest you can get is listing the registered functions:
scala> spark.catalog.listFunctions.show(false)
+-----+--------+-----------+-----------------------------------------------------+-----------+
|name |database|description|className                                            |isTemporary|
+-----+--------+-----------+-----------------------------------------------------+-----------+
|!    |null    |null       |org.apache.spark.sql.catalyst.expressions.Not        |true       |
|%    |null    |null       |org.apache.spark.sql.catalyst.expressions.Remainder  |true       |
|&    |null    |null       |org.apache.spark.sql.catalyst.expressions.BitwiseAnd |true       |
|*    |null    |null       |org.apache.spark.sql.catalyst.expressions.Multiply   |true       |
|+    |null    |null       |org.apache.spark.sql.catalyst.expressions.Add        |true       |
|-    |null    |null       |org.apache.spark.sql.catalyst.expressions.Subtract   |true       |
...
This does not display the function definition, though.
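If the concern is also the key sitting in the source code, one option is to read it from the environment when the UDF is registered. The sketch below is only an illustration under assumptions: the question does not show myHashFunction, so HMAC-SHA256 from the JDK's javax.crypto is used as a stand-in, and the KeyedHash object, its method names, and the HASH_SECRET variable name are all hypothetical.

```scala
import java.nio.charset.StandardCharsets.UTF_8
import javax.crypto.Mac
import javax.crypto.spec.SecretKeySpec

object KeyedHash {
  // Hypothetical stand-in for myHashFunction: HMAC-SHA256 of `value`
  // under `key`, returned as lowercase hex. Null input yields null so
  // the function can be applied to nullable columns.
  def hmacSha256Hex(value: String, key: String): String =
    if (value == null) null
    else {
      val mac = Mac.getInstance("HmacSHA256")
      mac.init(new SecretKeySpec(key.getBytes(UTF_8), "HmacSHA256"))
      mac.doFinal(value.getBytes(UTF_8)).map(b => f"${b & 0xff}%02x").mkString
    }

  // Reads the key from an environment variable (the name is an assumption)
  // and fails fast if it is not set, instead of hashing with an empty key.
  def secretFromEnv(name: String = "HASH_SECRET"): String =
    sys.env.getOrElse(
      name,
      throw new IllegalStateException(s"environment variable $name is not set"))
}
```

With this in place, the registration from the question would read the key once on the driver, for example: val secret = KeyedHash.secretFromEnv() followed by sparkSession.udf.register("hashFuncWithSecret", (s: String) => KeyedHash.hmacSha256Hex(s, secret)). The key then appears neither in the source nor in the SQL text, which only refers to hashFuncWithSecret by name.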