import spark.implicits._
import org.apache.spark.sql.Column
import org.apache.spark.sql.functions._
def reverseMap(colName:Column) = map_from_arrays(map_values(colName),map_keys(colName))
val testDF = Seq(("cat",Map("black"->3,"brown"->5,"white"->1)), ("dog",Map("cream"->6,"black"->5,"white"->2)))
.toDF("animal","ageMap")
testDF.show(false)
val testDF1 = testDF.withColumn("keySort",map_from_entries(array_sort(map_entries(col("ageMap")))))
This code runs fine in Spark 3+, but I want to run it on Spark < 3.
CodePudding user response:
Welcome to Stack Overflow!
From your comment I gather that your code was working in v3.2.2 but not in v2.4.5.
Your issue is that map_entries does not exist in Spark v2.4.5. You can get the same functionality by extracting the keys and values separately using map_keys and map_values, and then using arrays_zip to combine them.
The first bit is exactly the same:
import spark.implicits._
import org.apache.spark.sql.Column
import org.apache.spark.sql.functions._
def reverseMap(colName:Column) = map_from_arrays(map_values(colName),map_keys(colName))
val testDF = Seq(("cat",Map("black"->3,"brown"->5,"white"->1)), ("dog",Map("cream"->6,"black"->5,"white"->2))).toDF("animal","ageMap")
testDF.show(false)
+------+------------------------------------+
|animal|ageMap                              |
+------+------------------------------------+
|cat   |[black -> 3, brown -> 5, white -> 1]|
|dog   |[cream -> 6, black -> 5, white -> 2]|
+------+------------------------------------+
And the difference is in how you define testDF1:
val testDF1 = testDF
.withColumn("keys", map_keys(col("ageMap")))
.withColumn("values", map_values(col("ageMap")))
.withColumn("keySort", map_from_entries(array_sort(arrays_zip(col("keys"), col("values")))))
.select("animal", "ageMap", "keySort")
testDF1.show(false)
+------+------------------------------------+------------------------------------+
|animal|ageMap                              |keySort                             |
+------+------------------------------------+------------------------------------+
|cat   |[black -> 3, brown -> 5, white -> 1]|[black -> 3, brown -> 5, white -> 1]|
|dog   |[cream -> 6, black -> 5, white -> 2]|[black -> 5, cream -> 6, white -> 2]|
+------+------------------------------------+------------------------------------+
This code ran successfully on a v2.4.5 spark-shell.
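As an aside, the reverseMap helper defined at the top is never actually invoked in the snippets above. What map_from_arrays(map_values(colName), map_keys(colName)) produces is a value-to-key map, which a plain-Scala sketch (no Spark session needed) shows:

```scala
// Plain-Scala sketch of reverseMap's semantics:
// map_values -> values array, map_keys -> keys array,
// map_from_arrays(values, keys) -> map from each value to its key.
object ReverseMapSketch extends App {
  val ageMap   = Map("black" -> 3, "brown" -> 5, "white" -> 1)
  val reversed = ageMap.values.toSeq.zip(ageMap.keys.toSeq).toMap

  println(reversed)  // reversed == Map(3 -> "black", 5 -> "brown", 1 -> "white")
}
```

Note that, just like map_from_arrays in Spark, this swap only round-trips cleanly when the values are themselves distinct; duplicate values would collapse into a single entry.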