I have created a table in Databricks
create table TabA (latitude float, longitude float, col1 string,col2 string)
utils.executequery( """ update TabA set col1 = ST_Envelope(col2)""" ) I tried converting this output as string but getting error as _tostring() not supported
utils.executequery(""" optimize TabA """)
utils.executequery( """ update TabA set latitude = col1.Lat""" )
utils.executequery(""" optimize TabA """)
utils.executequery( """ update TabA set longitude= col1.Long""" )
utils.executequery(""" optimize TabA """)
I am getting the error
col1#22613: need struct type but got string
I tried casting the "col1" as string, but I was not able to solve this exception. How do I solve it?
CodePudding user response:
Your error must be coming from col1.Lat
and col1.Long
. Since your col1
is string, you cannot use dot .
notation such as col1.Lat
, because this notation is for struct data type, not for string.
Consider this example:
df = spark.createDataFrame([('x', (1.0, 2.0))], 'string_col:string, struct_col:struct<lat:double,lon:double>')
df.createOrReplaceTempView('TabA')
df.printSchema()
# root
# |-- string_col: string (nullable = true)
# |-- struct_col: struct (nullable = true)
# | |-- lat: double (nullable = true)
# | |-- lon: double (nullable = true)
df.show()
# ---------- ----------
# |string_col|struct_col|
# ---------- ----------
# | x|{1.0, 2.0}|
# ---------- ----------
The following SQL query works, because I address the struct type column and successfully extract the field lon
:
spark.sql('select struct_col.lon from TabA').show()
# ---
# |lon|
# ---
# |2.0|
# ---
But the following SQL query fails, because I try to do the same on the string tyoe column.
spark.sql('select string_col.lon from TabA').show()
AnalysisException: Can't extract value from string_col#617: need struct type but got string; line 1 pos 7