So I'm trying to clean some data in R, Im not very experinced with R. What I would like to do is Get a certain peice of Data from a JSON and store it into a Varaible The JSON is very large and messy, I tried using JSON Lite and using the $ Variable selector but couldnt find a way to grab the specific peice of Data I needed, Theres very little information and resource out there on JSON Manipulation in R
JSON File :
{
"error": false,
"msg": "all countries and population 1961 - 2018",
"data": [
{
"country": "Arab World",
"code": "ARB",
"populationCounts": [
{
"year": 1960,
"value": 92197753
},
{
"year": 1961,
"value": 94724510
},
{
"year": 1962,
"value": 97334442
}
]
}
]
}
So lets say I want to gather the 1962 Value into a Variable, what would be the best way I do that
Any suggestion would be a huge help
CodePudding user response:
You are on the right lines with jsonlite
. The data structure that you get in R depends on the JSON structure itself, of course, so you need to spend some time exploring the object in R to understand how you can manipulate it.
Assuming you have a file like your example data, named mydata.json
in the current path, you can read it like this:
library(jsonlite)
library(tidyr) # comes in later
j <- fromJSON("mydata.json")
What's in j
?
names(j)
[1] "error" "msg" "data"
How about data
?
str(j$data)
'data.frame': 1 obs. of 3 variables:
$ country : chr "Arab World"
$ code : chr "ARB"
$ populationCounts:List of 1
..$ :'data.frame': 3 obs. of 2 variables:
.. ..$ year : int 1960 1961 1962
.. ..$ value: int 92197753 94724510 97334442
So j$data
is a dataframe with 3 columns. The last one, with the years and values is a list column. So we can get into that using tidyr::unnest()
:
j$data %>%
unnest(cols = c(populationCounts))
Result:
# A tibble: 3 × 4
country code year value
<chr> <chr> <int> <int>
1 Arab World ARB 1960 92197753
2 Arab World ARB 1961 94724510
3 Arab World ARB 1962 97334442