I have this dataframe:
df <- data.frame(
option_label_1 = c("thickness", "strength", "color"),
option_value_1 = c("0.5 in", "2 lb" , "red"),
option_label_2 = c("size", "color", "thickness"),
option_value_2 = c("0.5 Inches x 7200 Feet", "blue" , "1 in"),
option_label_3 = c("stretch", NA, NA),
option_value_3 = c("wide", NA , NA)
)
option_label_1 option_value_1 option_label_2 option_value_2 option_label_3 option_value_3
1 thickness 0.5 in size 0.5 Inches x 7200 Feet stretch wide
2 strength 2 lb color blue <NA> <NA>
3 color red thickness 1 in <NA> <NA>
I want this data frame:
option_label_1 option_value_1 option_label_2 option_value_2 option_label_3 option_value_3
1 thickness 0.5 in size 0.5 Inches x 7200 Feet stretch wide
2 strength 2 lb color blue <NA> <NA>
3 color red thickness 1 in <NA> <NA>
json
1 {"thickness":"0.5 in","size":"0.5 Inches x 7200 Feet","stretch":"wide"}
2 {"strength":"2 lb","color":"blue"}
3 {"color":"red","thickness":"1 in"}
Essentially I want a JSON column added to the original df built off of the original columns using the option labels and option values. Please note I do not want a solution that converts the whole dataframe to JSON using toJSON. I have a much larger dataframe with other fields I do not want in JSON. I just want the option_labels and their respective option_values to be in JSON.
I have tried using list and paste functions nested in toJSON, but the "option_labels" are static and don't change accordingly in the resulting JSON column.
Thanks for your help!
CodePudding user response:
Here's an option using dplyr
and tidyr
-
library(dplyr)
library(tidyr)
#Add a row number column to keep track of each row
#Useful for joining afterwards.
df1 <- df %>% mutate(rownum = row_number())
df1 <- df1 %>%
#Get the data in long format so that we have
#option_label and option_value as 2 separate columns
#Drop NA values.
pivot_longer(cols = -rownum,
names_to = '.value',
names_pattern = '(option_\\w )_',
values_drop_na = TRUE) %>%
#Create a string with the pattern "column_name" : "column_value"
mutate(json = sprintf('"%s" : "%s"', option_label, option_value)) %>%
#for each row
group_by(rownum) %>%
#Combine the json value in a comma separated string.
#Also add "{..}" surrounding them.
summarise(json = sprintf('{%s}', toString(json))) %>%
#Join to get original dataframe back with a new column
inner_join(df1, by = 'rownum')
#View the output
cat(df1$json, sep = "\n")
#{"thickness" : "0.5 in", "size" : "0.5 Inches x 7200 Feet", "stretch" : "wide"}
#{"strength" : "2 lb", "color" : "blue"}
#{"color" : "red", "thickness" : "1 in"}
CodePudding user response:
You could also do:
df %>%
rownames_to_column('rn') %>%
pivot_longer(-rn, '.value', names_pattern = '(.*)_', values_drop_na = TRUE) %>%
group_by(rn) %>%
summarise(json = jsonlite::toJSON(data.table::transpose(cur_data(),make.names = TRUE)))
# A tibble: 3 x 2
rn json
<chr> <json>
1 1 [{"thickness":"0.5 in","size":"0.5 Inches x 7200 Feet","stretch":"wide"}]
2 2 [{"strength":"2 lb","color":"blue"}]
3 3 [{"color":"red","thickness":"1 in"}]
CodePudding user response:
This solution ultimately worked for me. Thanks to everyone who contributed.
df_1 <- df %>%
rowwise() %>%
dplyr::mutate(
options_json = ifelse(!is.na(option_value_3),
paste(toJSON(setNames(list(option_label_1 = paste(option_value_1), option_label_2 = paste(option_value_2), option_label_3 = paste(option_value_3)), c(option_label_1, option_label_2, option_label_3)), auto_unbox = T)),
ifelse(!is.na(option_value_2),
paste(toJSON(setNames(list(option_label_1 = paste(option_value_1), option_label_2 = paste(option_value_2)), c(option_label_1, option_label_2)), auto_unbox = T)),
paste(toJSON(setNames(list(option_label_1 = paste(option_value_1)), option_label_1), auto_unbox = T))
)
)
)