spark.sql("""select get_json_object('{"k":{"value":"abc"}}', '$.*.value') as j""").show()
This returns null, while it should return 'abc'. It works if I replace `*` with `k`.
I'm aware of the limited JSONPath support (https://cwiki.apache.org/confluence/display/Hive/LanguageManual+UDF#LanguageManualUDF-get_json_object), but is there a way to achieve this in Spark SQL?
CodePudding user response:
There is an open Spark JIRA, "Any depth search not working in get_json_object ($..foo)", tracking full JsonPath support.
Until it is resolved, I'm afraid creating a UDF that wraps a general-purpose JsonPath implementation may be the only option:
> spark-shell --jars "json-path-2.7.0.jar"
:
scala> import com.jayway.jsonpath.JsonPath
import com.jayway.jsonpath.JsonPath
scala> val r = JsonPath.read[net.minidev.json.JSONArray]("""
| {"k":{"value":"abc"},
| "m":{"value":"def"}}
| ""","$.*.value")
r: net.minidev.json.JSONArray = ["abc","def"]
scala>
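Building on the shell session above, such a UDF could look like the sketch below. This is my own illustration, not an official API: the names `getJsonPath`/`get_json_path` are invented, and it assumes the `json-path` jar (plus its `json-smart` dependency) is on both the driver and executor classpath.

```scala
import com.jayway.jsonpath.JsonPath
import org.apache.spark.sql.functions.udf

// Hypothetical UDF wrapping Jayway JsonPath; evaluates an arbitrary
// JsonPath expression against a JSON string column.
val getJsonPath = udf { (json: String, path: String) =>
  try {
    // For wildcard paths JsonPath.read returns a net.minidev.json.JSONArray,
    // whose toString renders it as JSON text.
    Option(JsonPath.read[Object](json, path)).map(_.toString)
  } catch {
    // Mimic get_json_object's behaviour: return null on invalid input.
    case _: Exception => None
  }
}

spark.udf.register("get_json_path", getJsonPath)
spark.sql("""select get_json_path('{"k":{"value":"abc"}}', '$.*.value') as j""").show()
```

Note that, unlike `get_json_object`, a wildcard match comes back as a JSON array string (e.g. `["abc"]`), so you may want to unwrap single-element results depending on your use case.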