I have a JSON file that I want to convert to a table. It is easy to do with the jsonlite
library. However, if the file is big, then the conversion takes a significant amount of time. So I am testing a tidyjson
with the hope to speed up the process.
My JSON file looks as follows:
x = '[
{
"id": 1,
"A": [
{
"B": "b1",
"C": [
"c1"
]
}
]
},
{
"id": 2,
"A": [
{
"B": "b1",
"C": [
"c2"
]
}
]
}
]
'
That's how I process it:
library(tidyjson)
library(dplyr)
x %>% gather_array() %>%
spread_values(id = jstring("id")) %>%
enter_object("A") %>% gather_array %>%
spread_values(B = jstring("B")) %>%
enter_object("C") %>% gather_array() %>%
spread_values(C = jstring("C")) %>%
select(id, B, C)
Outcome I get:
..JSON id B C
<chr> <chr> <chr> <chr>
1 "\"c1\"" 1 b1 NA
2 "\"c2\"" 2 b1 NA
Cannot figure out what is wrong with the code and why it doesn’t work well for C. Any help is much appreciated.
UPDATE: Expected output:
id B C
<chr> <chr> <chr>
1 1 b1 c1
2 2 b1 c2
UPDATE 2:
jsonlite
way:
y = jsonlite::fromJSON(x)
cbind(id = y$id, do.call(rbind.data.frame, y$A))
id B C
1 1 b1 c1
2 2 b1 c2
Not sure that it is the fastest way of using jsonlite
in this case.
CodePudding user response:
We may use fromJSON
from jsonlite
library(jsonlite)
library(tidyr)
library(dplyr)
fromJSON(x) %>%
unnest_wider(A) %>%
unnest(C) %>%
unnest(C)
-output
# A tibble: 2 x 3
id B C
<int> <chr> <chr>
1 1 b1 c1
2 2 b1 c2
Or another option is
library(reticulate)
library(rrapply)
py_run_string(paste0("x = ", x))
rrapply(py$x, how = 'bind')
A.1.C A.1.B id
1 c1 b1 1
2 c2 b1 2