How to parse embedded JSON in Spark

Time: 09-25

For example:
{
  "_index" : "nginxacc-2016.09.30",
  "_type" : "logs",
  "_id" : "AVd6G5gNfVF4aGz4f2fE",
  "_version" : 1,
  "_score" : 1,
  "_source" : {
    "@timestamp" : "2016-09-30T00:00:09.000Z",
    "clientip" : "42.122.1.97",
    "status" : "200",
    "@version" : "1",
    "geoip" : {
      "ip" : "42.122.1.97",
      "country_code2" : "CN",
      "country_code3" : "CHN",
      "country_name" : "China",
      "continent_code" : "AS",
      "region_name" : "28",
      "city_name" : "Tianjin",
      "latitude" : 39.1422,
      "longitude" : 117.17669999999998,
      "timezone" : "Asia/Shanghai",
      "real_region_name" : "Tianjin",
      "location" : [
        117.17669999999998,
        39.1422
      ]
    }
  }
}

I am unable to get the contents of the geoip field into a Spark DataFrame.
Could anyone help?

CodePudding user response:

Run printSchema to see what type each embedded JSON field maps to, then use an anonymous UDF to process it.
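A minimal sketch of that approach, assuming the document above is stored one JSON object per line in a file named `nginx.json` (the path is hypothetical). When Spark's JSON reader infers the schema, `geoip` comes back as a nested struct, so its fields can often be selected with plain dot notation; a UDF is only needed if the field arrives as a raw string:

```scala
import org.apache.spark.sql.SparkSession

object GeoipExtract {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("geoip-extract")
      .master("local[*]")
      .getOrCreate()

    // Read the newline-delimited JSON; Spark infers the nested schema.
    val df = spark.read.json("nginx.json") // hypothetical path

    // First inspect what types the embedded fields were inferred as.
    df.printSchema()

    // Nested struct fields can be addressed with dot notation.
    val geo = df.select(
      "_source.clientip",
      "_source.geoip.country_name",
      "_source.geoip.city_name",
      "_source.geoip.latitude",
      "_source.geoip.longitude"
    )
    geo.show(truncate = false)

    spark.stop()
  }
}
```

If `geoip` was instead ingested as a plain string column, it can be parsed with `from_json` (given an explicit schema) or, as the answer suggests, with an anonymous UDF that deserializes the string and returns the wanted fields.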