In PySpark, how do I read a specific JSON attribute that has been loaded to a dataframe?

Time:02-16

I am trying to get the value of "__delta" from the following JSON schema, which has been loaded into a dataframe. How do I do that in PySpark?

root
 |-- d: struct (nullable = true)
 |    |-- __delta: string (nullable = true)
 |    |-- __next: string (nullable = true)
 |    |-- results: array (nullable = true)
 |    |    |-- element: struct (containsNull = true)
 |    |    |    |-- ABRVW: string (nullable = true)
 |    |    |    |-- ADRNR: string (nullable = true)
 |    |    |    |-- ANRED: string (nullable = true)

CodePudding user response:

Since "d" is a struct column, you can select the nested attribute you want directly, using dot notation:

df.select("d.__delta")

CodePudding user response:

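How about df.select(col("d.__delta")), with col imported from pyspark.sql.functions? (The $"d.__delta" form is Scala syntax and does not work in PySpark.)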
