Home > database >  Convert columns of R dataframe to JSON
Convert columns of R dataframe to JSON

Time:02-10

I have this dataframe:

df <- data.frame(
  option_label_1 = c("thickness", "strength", "color"),
  option_value_1 = c("0.5 in", "2 lb" , "red"),
  option_label_2 = c("size", "color", "thickness"),
  option_value_2 = c("0.5 Inches x 7200 Feet", "blue" , "1 in"),
  option_label_3 = c("stretch", NA, NA),
  option_value_3 = c("wide", NA , NA)
  )
  option_label_1 option_value_1 option_label_2         option_value_2 option_label_3 option_value_3
1      thickness         0.5 in           size 0.5 Inches x 7200 Feet        stretch           wide
2       strength           2 lb          color                   blue           <NA>           <NA>
3          color            red      thickness                   1 in           <NA>           <NA>

I want this data frame:

  option_label_1 option_value_1 option_label_2         option_value_2 option_label_3 option_value_3
1      thickness         0.5 in           size 0.5 Inches x 7200 Feet        stretch           wide
2       strength           2 lb          color                   blue           <NA>           <NA>
3          color            red      thickness                   1 in           <NA>           <NA>
                                                                       json
1 {"thickness":"0.5 in","size":"0.5 Inches x 7200 Feet","stretch":"wide"}
2                                       {"strength":"2 lb","color":"blue"}
3                                        {"color":"red","thickness":"1 in"}

Essentially I want a JSON column added to the original df built off of the original columns using the option labels and option values. Please note I do not want a solution that converts the whole dataframe to JSON using toJSON. I have a much larger dataframe with other fields I do not want in JSON. I just want the option_labels and their respective option_values to be in JSON.

I have tried using list and paste functions nested in toJSON, but the "option_labels" are static and don't change accordingly in the resulting JSON column.

Thanks for your help!

CodePudding user response:

Here's an option using dplyr and tidyr -

library(dplyr)
library(tidyr)

#Add a row number column to keep track of each row
#Useful for joining afterwards. 
df1 <- df %>% mutate(rownum = row_number())

df1 <- df1 %>%
  #Get the data in long format so that we have 
  #option_label and option_value as 2 separate columns
  #Drop NA values.
  pivot_longer(cols = -rownum,
               names_to = '.value', 
               names_pattern = '(option_\\w )_',
               values_drop_na = TRUE) %>%
  #Create a string with the pattern "column_name" : "column_value"
  mutate(json = sprintf('"%s" : "%s"', option_label, option_value)) %>%
  #for each row
  group_by(rownum) %>%
  #Combine the json value in a comma separated string. 
  #Also add "{..}" surrounding them.
  summarise(json = sprintf('{%s}', toString(json))) %>%
  #Join to get original dataframe back with a new column
  inner_join(df1, by = 'rownum')

#View the output
cat(df1$json, sep = "\n")

#{"thickness" : "0.5 in", "size" : "0.5 Inches x 7200 Feet", "stretch" : "wide"}
#{"strength" : "2 lb", "color" : "blue"}
#{"color" : "red", "thickness" : "1 in"}

CodePudding user response:

You could also do:

df %>%
  rownames_to_column('rn') %>%
  pivot_longer(-rn, '.value', names_pattern = '(.*)_', values_drop_na = TRUE) %>%
  group_by(rn) %>%
  summarise(json = jsonlite::toJSON(data.table::transpose(cur_data(),make.names = TRUE)))

# A tibble: 3 x 2
  rn    json                                                                     
  <chr> <json>                                                                   
1 1     [{"thickness":"0.5 in","size":"0.5 Inches x 7200 Feet","stretch":"wide"}]
2 2     [{"strength":"2 lb","color":"blue"}]                                     
3 3     [{"color":"red","thickness":"1 in"}]    

CodePudding user response:

This solution ultimately worked for me. Thanks to everyone who contributed.

df_1 <- df %>%
  rowwise() %>%
  dplyr::mutate(
    options_json = ifelse(!is.na(option_value_3),
      paste(toJSON(setNames(list(option_label_1 = paste(option_value_1), option_label_2 = paste(option_value_2), option_label_3 = paste(option_value_3)), c(option_label_1, option_label_2, option_label_3)), auto_unbox = T)),
      ifelse(!is.na(option_value_2),
        paste(toJSON(setNames(list(option_label_1 = paste(option_value_1), option_label_2 = paste(option_value_2)), c(option_label_1, option_label_2)), auto_unbox = T)),
        paste(toJSON(setNames(list(option_label_1 = paste(option_value_1)), option_label_1), auto_unbox = T))
      )
    )
  )
  • Related