I have a string variable type in my data frame that has a long string (its a JSON response) with both the names of the columns I want and the values following it.
My data frame looks like this:
- each row is a participant
- Participant column is the list of each participants
- Responses has a string entry with a JSON response, where I would like the beginning of the entry to be the variable and what is after the ":" to be the value.
Participant | Responses |
---|---|
Emily | {"participantAge":"40","participantEducation":"Bachelors"} |
Doug | {"participantAge":"35","participantEducation":"Bachelors"} |
So for instance, the goal is to have a participantAge column with the values as the entries and participantEducation as a column with the entries
Participant | Responses | participantAge | participantEducation |
---|---|---|---|
Emily | {"} | 40 | Bachelors |
Doug | {"} | 35 | Bachelors |
Ive been able to do this before with python by converting the JSON response to dictionaries, but im not sure how to implement this in R.
CodePudding user response:
You can do this as follows using dplyr
and jsonlite
library(dplyr)
library(jsonlite)
df %>%
rowwise() %>%
mutate(Response = list(parse_json(Response))) %>%
unnest_wider(Response)
Output:
Participant participantAge participantEducation
<chr> <chr> <chr>
1 Emily 35 Bachelors
2 Doug 40 Bachelors
Input:
df = structure(list(Participant = c("Emily", "Doug"), Response = c("{\"participantAge\":\"35\",\"participantEducation\":\"Bachelors\"}",
"{\"participantAge\":\"40\",\"participantEducation\":\"Bachelors\"}"
)), class = "data.frame", row.names = c(NA, -2L))
CodePudding user response:
You can try the jsonlite
package:
library("jsonlite")
dat_df <- data.frame(Emily='{"participantAge":"40","participantEducation":"Bachelors"}',
Doug='{"participantAge":"35","participantEducation":"Bachelors"}')
fromJSON_rec <- apply(dat_df, 2, fromJSON)
new_df <- data.frame(matrix(NA, nrow=2, ncol=3))
colnames(new_df) <- c("Participant", "participantAge", "participantEducation")
for(i in 1:length(fromJSON_rec)){
new_df[i,] <- c(names(fromJSON_rec)[i],
fromJSON_rec[[names(fromJSON_rec)[i]]][["participantAge"]],
fromJSON_rec[[names(fromJSON_rec)[i]]][["participantEducation"]])
}
> dat_df
Emily Doug
1 {"participantAge":"40","participantEducation":"Bachelors"} {"participantAge":"35","participantEducation":"Bachelors"}
> new_df
Participant participantAge participantEducation
1 Emily 40 Bachelors
2 Doug 35 Bachelors