How can you extract the elements of a nested JSON array using Spark in Java


Here is my JSON file content.

{
  "Id": 11,
  "data": [
    {
      "package": "com.browser1",
      "activetime": 60000,
      "steps": [
        {"x":  1, "y":  2},
        {"x": 11, "y": 12}
      ]
    },
    {
      "package": "com.browser6",
      "activetime": 1205000,
      "steps": [
        {"x":  3, "y":  4}
      ]
    },
    {
      "package": "com.browser7",
      "activetime": 1205000,
      "steps": [
        {"x":  5, "y":  6}
      ]
    }
  ]
}

I am reading this JSON file in Spark using Java. How can I get the values of the following:

json.data[0].steps[0].x
and 
json.data[0].steps[1].x

I tried using Dataset.select to do this, but

df.select(json.data[0].steps[1].x)

did not work.

P.S.: I need a Java solution, not Scala.

CodePudding user response:

If you read the JSON file like this (Java syntax, note the `read()` call):

sparkSession.read().option("multiline", true).json("./data.json")
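As a fuller reference, here is a minimal Java sketch of the read step. The app name and local master are placeholders, and the file path `./data.json` is taken from the snippet above:

import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Row;
import org.apache.spark.sql.SparkSession;

// Placeholder session setup for a local run.
SparkSession sparkSession = SparkSession.builder()
        .appName("nested-json-example")
        .master("local[*]")
        .getOrCreate();

// "multiline" is needed because the JSON object spans several lines.
Dataset<Row> df = sparkSession.read()
        .option("multiline", true)
        .json("./data.json");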

Then you can access your desired values as below:

.withColumn("test", col("data").getItem(0).getField("steps").getItem(1).getField("x"))

or

.withColumn("test2", expr("data[0].steps[1].x"))

Both do the same thing; use whichever you prefer. The returned value is 11.
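Putting it together, a hedged end-to-end sketch in Java (the column names "test" and "test2" are just examples, and `df` is the Dataset read above):

import static org.apache.spark.sql.functions.col;
import static org.apache.spark.sql.functions.expr;

import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Row;

// Both columns extract data[0].steps[1].x, which is 11 for this file.
Dataset<Row> result = df
        .withColumn("test", col("data").getItem(0).getField("steps").getItem(1).getField("x"))
        .withColumn("test2", expr("data[0].steps[1].x"));

// Shows Id = 11, test = 11, test2 = 11 for the sample JSON.
result.select("Id", "test", "test2").show();

Note the static import of org.apache.spark.sql.functions for col and expr; without it the column expressions will not compile in Java.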

Good luck!
