Home > Enterprise >  Extract value from array in Spark
Extract value from array in Spark

Time:05-04

I am trying to extract a value from an array in SparkSQL, but getting the error below:

Example column

customer_details  
{"original_customer_id":"ch_382820","first_customer_id":"ch_343948"}

I am using this code:

get_json_object(customer_details, '$.original_customer_id') as customer_id

But I am getting the following error:

error: invalid string interpolation $., expected: $$, $identifier or ${expression}
       spark.sql(s"""

error: unclosed character literal (or use " not ' for string literal)
           get_json_object(customer_details, '$.original_customer_id') as customer_id,

CodePudding user response:

To me the following worked:

val df = Seq("{'original_customer_id':'ch_382820','first_customer_id':'ch_343948'}").toDF("customer_details")
df.show(truncate=false)
//  -------------------------------------------------------------------- 
// |customer_details                                                    |
//  -------------------------------------------------------------------- 
// |{'original_customer_id':'ch_382820','first_customer_id':'ch_343948'}|
//  -------------------------------------------------------------------- 

df.selectExpr("get_json_object(customer_details, '$.original_customer_id') as customer_id").show()
//  ----------- 
// |customer_id|
//  ----------- 
// |  ch_382820|
//  ----------- 

As requested, this is Spark SQL version:

select get_json_object(customer_details, '$.original_customer_id') as customer_id
from df
df.createOrReplaceTempView("df")

spark.sql(
    """
    select get_json_object(customer_details, '$.original_customer_id') as customer_id
    from df
    """
).show()
//  ----------- 
// |customer_id|
//  ----------- 
// |  ch_382820|
//  ----------- 
  • Related