Home > Blockchain >  Convert a string entry in a variable into multiple variables R
Convert a string entry in a variable into multiple variables R

Time:05-15

I have a string variable type in my data frame that has a long string (its a JSON response) with both the names of the columns I want and the values following it.

My data frame looks like this:

  • each row is a participant
  • Participant column is the list of each participants
  • Responses has a string entry with a JSON response, where I would like the beginning of the entry to be the variable and what is after the ":" to be the value.
Participant Responses
Emily {"participantAge":"40","participantEducation":"Bachelors"}
Doug {"participantAge":"35","participantEducation":"Bachelors"}

So for instance, the goal is to have a participantAge column with the values as the entries and participantEducation as a column with the entries

Participant Responses participantAge participantEducation
Emily {"} 40 Bachelors
Doug {"} 35 Bachelors

Ive been able to do this before with python by converting the JSON response to dictionaries, but im not sure how to implement this in R.

CodePudding user response:

You can do this as follows using dplyr and jsonlite


library(dplyr)
library(jsonlite)

df %>% 
  rowwise() %>%
  mutate(Response = list(parse_json(Response))) %>%
  unnest_wider(Response)

Output:

  Participant participantAge participantEducation
  <chr>       <chr>          <chr>               
1 Emily       35             Bachelors           
2 Doug        40             Bachelors 

Input:

df = structure(list(Participant = c("Emily", "Doug"), Response = c("{\"participantAge\":\"35\",\"participantEducation\":\"Bachelors\"}", 
"{\"participantAge\":\"40\",\"participantEducation\":\"Bachelors\"}"
)), class = "data.frame", row.names = c(NA, -2L))

CodePudding user response:

You can try the jsonlite package:

library("jsonlite")

dat_df <- data.frame(Emily='{"participantAge":"40","participantEducation":"Bachelors"}',
                     Doug='{"participantAge":"35","participantEducation":"Bachelors"}')

fromJSON_rec <- apply(dat_df, 2, fromJSON)

new_df <- data.frame(matrix(NA, nrow=2, ncol=3))
colnames(new_df) <- c("Participant",    "participantAge",   "participantEducation")

for(i in 1:length(fromJSON_rec)){
  new_df[i,] <- c(names(fromJSON_rec)[i],
                  fromJSON_rec[[names(fromJSON_rec)[i]]][["participantAge"]],
                  fromJSON_rec[[names(fromJSON_rec)[i]]][["participantEducation"]])
}
> dat_df
                                                       Emily                                                       Doug
1 {"participantAge":"40","participantEducation":"Bachelors"} {"participantAge":"35","participantEducation":"Bachelors"}
> new_df
  Participant participantAge participantEducation
1       Emily             40            Bachelors
2        Doug             35            Bachelors
  • Related