I have a dataframe:
location <- c("a", "b", "c", "d", "e", "e")
type <- c("city", "city", "town", "town", "village", "village")
code <- c("123", "112", "83749", "83465", "38484757", "3838891")
country <- c("zz", "zz", "zz", "zz", "zz", "zz")
df <- data.frame(location, type, code, country)
I want to group by location and convert the result to a dictionary, something like this:
{location:[[type], [code], [country]]}
I know this should be quite straightforward in Python, but I am not sure how to do it in R. I have tried the following using unclass, but still didn't get what I am expecting:
unclass(by(df, df$location, function(x) {
tmp <- x$code
setNames(tmp, x$location[1])
tmp
})) -> location_mapping
Expected Output:
{
'a':[['city'],['123'],['zz']],
'b':[['city'],['112'],['zz']],
'c':[['town'],['83749'],['zz']],
'd':[['town'],['83465'],['zz']],
'e':[['village'],['38484757','3838891'],['zz']]
}
CodePudding user response:
#--- EDITED
From your updated question, something like this might be what you want. R has no curly-brace dictionary literal like Python does. Still, for the purpose of feeding further functions, the code below gives one row per location, with all of its codes collected in a list column:
library(dplyr)
location <- c("a", "b", "c", "d", "e", "e")
type <- c("city", "city", "town", "town", "village", "village")
code <- c("123", "112", "83749", "83465", "38484757", "3838891")
country <- c("zz", "zz", "zz", "zz", "zz", "zz")
df <- data.frame(location, type, code, country)
df %>%
  dplyr::group_by(location) %>%
  summarise(code = list(code), across()) %>% # makes list of multiple `code` entries / `across()` keeps cols
  filter(!duplicated(location)) %>% # filtering duplicate locations
  .[, c(1, 3, 2, 4)] # arranging cols
# # A tibble: 5 × 4
# # Groups:   location [5]
#   location type    code      country
#   <chr>    <chr>   <list>    <chr>
# 1 a        city    <chr [1]> zz
# 2 b        city    <chr [1]> zz
# 3 c        town    <chr [1]> zz
# 4 d        town    <chr [1]> zz
# 5 e        village <chr [2]> zz
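For comparison, the closest base R analogue of a Python dictionary is a named list. The following is a minimal sketch of that idea, not part of the answer above: it assumes the same example data, and the name `location_mapping` is borrowed from the question. It keeps the inner entries named (`type`, `code`, `country`) so they can be looked up by key.

```r
# Rebuild the example data; stringsAsFactors = FALSE keeps strings as
# character vectors on R versions before 4.0 as well.
df <- data.frame(
  location = c("a", "b", "c", "d", "e", "e"),
  type     = c("city", "city", "town", "town", "village", "village"),
  code     = c("123", "112", "83749", "83465", "38484757", "3838891"),
  country  = c("zz", "zz", "zz", "zz", "zz", "zz"),
  stringsAsFactors = FALSE
)

# split() yields one data frame per location (named by location);
# the inner lapply() reduces every remaining column to its unique values.
location_mapping <- lapply(
  split(df[c("type", "code", "country")], df$location),
  function(g) lapply(g, unique)
)

# Dictionary-style lookups by key
location_mapping[["a"]]$type  # type recorded for location "a"
location_mapping$e$code       # all codes recorded for location "e"
```

Because the result is an ordinary list, it can be passed to `lapply()` or other functions without depending on dplyr.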
CodePudding user response:
You can summarise each group of location with unique() across multiple columns.
library(dplyr)
dict <- df %>%
  group_by(location) %>%
  summarise(across(everything(), ~ list(unique(.x)))) # unique values of every non-grouping column, stored as list columns
dict
# # A tibble: 5 × 4
# location type code country
# <chr> <list> <list> <list>
# 1 a <chr [1]> <chr [1]> <chr [1]>
# 2 b <chr [1]> <chr [1]> <chr [1]>
# 3 c <chr [1]> <chr [1]> <chr [1]>
# 4 d <chr [1]> <chr [1]> <chr [1]>
# 5 e <chr [1]> <chr [2]> <chr [1]>
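After converting it to JSON, you get a structure close to the expected one (single values are unboxed by auto_unbox = TRUE, so they are not wrapped in their own brackets).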
split(dict[-1], dict$location) %>%
jsonlite::toJSON(dataframe = "values", pretty = TRUE, auto_unbox = TRUE)
# {
# "a": [["city", "123", "zz"]],
# "b": [["city", "112", "zz"]],
# "c": [["town", "83749", "zz"]],
# "d": [["town", "83465", "zz"]],
# "e": [["village", ["38484757", "3838891"], "zz"]]
# }
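If every field needs to stay wrapped in its own array, as in the question's expected output, one option is to serialise an unnamed list per location without `auto_unbox`. This is a hedged sketch rather than the method used above: it assumes the jsonlite package is installed, rebuilds the example data so that it runs on its own, and uses the illustrative names `mapping` and `json`.

```r
library(jsonlite)

# Same example data as in the question
df <- data.frame(
  location = c("a", "b", "c", "d", "e", "e"),
  type     = c("city", "city", "town", "town", "village", "village"),
  code     = c("123", "112", "83749", "83465", "38484757", "3838891"),
  country  = c("zz", "zz", "zz", "zz", "zz", "zz"),
  stringsAsFactors = FALSE
)

# Outer list is named by location (serialised as a JSON object);
# unname() makes each inner list serialise as a JSON array of arrays.
mapping <- lapply(
  split(df[-1], df$location),
  function(g) unname(lapply(g, unique))
)

# Leaving auto_unbox at its default (FALSE) keeps length-one vectors as arrays
json <- toJSON(mapping, pretty = TRUE)
json
```

Named lists become JSON objects and unnamed lists become JSON arrays, which is why the inner names are dropped here.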