I am trying to extract a value from an array in SparkSQL, but getting the error below:
Example column
customer_details
{"original_customer_id":"ch_382820","first_customer_id":"ch_343948"}
I am using this code:
get_json_object(customer_details, '$.original_customer_id') as customer_id
But I am getting the following error:
error: invalid string interpolation $., expected: $$, $identifier or ${expression}
spark.sql(s"""
error: unclosed character literal (or use " not ' for string literal)
get_json_object(customer_details, '$.original_customer_id') as customer_id,
CodePudding user response:
To me the following worked:
val df = Seq("{'original_customer_id':'ch_382820','first_customer_id':'ch_343948'}").toDF("customer_details")
df.show(truncate=false)
// --------------------------------------------------------------------
// |customer_details |
// --------------------------------------------------------------------
// |{'original_customer_id':'ch_382820','first_customer_id':'ch_343948'}|
// --------------------------------------------------------------------
df.selectExpr("get_json_object(customer_details, '$.original_customer_id') as customer_id").show()
// -----------
// |customer_id|
// -----------
// | ch_382820|
// -----------
As requested, this is Spark SQL version:
select get_json_object(customer_details, '$.original_customer_id') as customer_id
from df
df.createOrReplaceTempView("df")
spark.sql(
"""
select get_json_object(customer_details, '$.original_customer_id') as customer_id
from df
"""
).show()
// -----------
// |customer_id|
// -----------
// | ch_382820|
// -----------