Convert Dataframe to List of List (python equivalent of dictionary) in R

Time:01-13

I have a dataframe:

location <- c("a", "b", "c", "d", "e", "e")
type <- c("city", "city", "town", "town", "village", "village")
code <- c("123", "112", "83749", "83465", "38484757", "3838891")
country <- c("zz", "zz", "zz", "zz", "zz", "zz")
df <- data.frame(location, type, code, country)

I want to group by location and convert the result to a dictionary, something like the following:

{location:[[type], [code], [country]]}

I know this should be quite straightforward in Python, but I am not sure how to do it in R. I tried the following with unclass, but still didn't get what I am expecting:

unclass(by(df, df$location, function(x) {
  tmp <- x$code
  setNames(tmp, x$location[1])
  tmp
})) -> location_mapping

Expected Output:

{
'a':[['city'],['123'],['zz']],
'b':[['city'],['112'],['zz']],
'c':[['town'],['83749'],['zz']],
'd':[['town'],['83465'],['zz']],
'e':[['village'],['38484757','3838891'],['zz']]
}

CodePudding user response:

#--- EDITED

From your updated question, something like this might be what you want. R has no curly-brace dictionary syntax like Python does. Still, for the purpose of feeding the result to further functions, the code below does what you want:

library(dplyr)

location <- c("a", "b", "c", "d", "e", "e")
type <- c("city", "city", "town", "town", "village", "village")
code <- c("123", "112", "83749", "83465", "38484757", "3838891")
country <- c("zz", "zz", "zz", "zz", "zz", "zz")
df <- data.frame(location, type, code, country)

df %>% 
  dplyr::group_by(location) %>% 
  summarise(code=list(code), across()) %>% # makes list of multiple `code` entries / `across()` keeps cols
  filter(!duplicated(location)) %>% # filtering duplicate locations
  .[, c(1, 3, 2, 4)] # reordering cols to location, type, code, country

# A tibble: 5 × 4
# Groups:   location [5]
  location type    code      country
  <chr>    <chr>   <list>    <chr>  
1 a        city    <chr [1]> zz     
2 b        city    <chr [1]> zz     
3 c        town    <chr [1]> zz     
4 d        town    <chr [1]> zz     
5 e        village <chr [2]> zz    
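
If what you need is an object you can look up by key rather than a table, a named list of lists is probably the closest base R counterpart to a Python dictionary. The following is only a minimal sketch under that assumption: it rebuilds the question's `df` (with `stringsAsFactors = FALSE` added so the columns stay character on older R versions) and reuses the name `location_mapping` from the question.

```r
# Same example data as in the question; stringsAsFactors = FALSE keeps the
# columns as character vectors on R versions before 4.0 as well.
df <- data.frame(
  location = c("a", "b", "c", "d", "e", "e"),
  type     = c("city", "city", "town", "town", "village", "village"),
  code     = c("123", "112", "83749", "83465", "38484757", "3838891"),
  country  = c("zz", "zz", "zz", "zz", "zz", "zz"),
  stringsAsFactors = FALSE
)

# Split the non-key columns by location, then keep the unique values of each
# column within every group. The result is a named list of named lists.
location_mapping <- lapply(
  split(df[-1], df$location),
  function(group) lapply(group, unique)
)

# Look up one "key", much like indexing a dictionary
location_mapping[["e"]][["code"]]
# [1] "38484757" "3838891"
```

Each top-level element is named after a location and holds `type`, `code` and `country`, so `location_mapping$a$type` gives "city". No packages are required for this version.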

CodePudding user response:

You can summarise each location group by applying unique() across all the other columns.

library(dplyr)

dict <- df %>%
  group_by(location) %>%
  summarise(across(everything(), ~ list(unique(.x)))) # unique values of every non-grouping column, stored as list columns

dict
# # A tibble: 5 × 4
#   location type      code      country
#   <chr>    <list>    <list>    <list>
# 1 a        <chr [1]> <chr [1]> <chr [1]>
# 2 b        <chr [1]> <chr [1]> <chr [1]>
# 3 c        <chr [1]> <chr [1]> <chr [1]>
# 4 d        <chr [1]> <chr [1]> <chr [1]>
# 5 e        <chr [1]> <chr [2]> <chr [1]>

After converting it to JSON, you can get the expected structure.

split(dict[-1], dict$location) %>%
  jsonlite::toJSON(dataframe = "values", pretty = TRUE, auto_unbox = TRUE)

# {
#   "a": [["city", "123", "zz"]],
#   "b": [["city", "112", "zz"]],
#   "c": [["town", "83749", "zz"]],
#   "d": [["town", "83465", "zz"]],
#   "e": [["village", ["38484757", "3838891"], "zz"]]
# }