Creating individual dataframes from a grouped dataframe

Time:01-02

I have a dataframe that needs to be split into individual files based on the value of a variable in the dataframe. The real data contains scores of individuals and confidential information, so a simplified example is below. I want the split to be based on the variable "first".

first <- c("Jon", "Bill", "Bill", "Maria", "Ben", "Tina")
age <- c(23, 41, 41, 32, 58, 26)

df <- data.frame(first , age)
df

For example, I want the file with Jon to have one line and the file with Bill to have two lines. I've attempted the following but I'm stuck. I don't know how to get individual dataframes from the list df.split.

library(tidyverse)
df.grped <- 
  df %>%
  group_by(first)

df.split <- 
  group_split(df.grped)

So I would like to end up with the dataframes df.split_Jon, df.split_Bill, df.split_Maria, etc. The actual source file is large, so I don't want to specify each one by hand.

Since I understand working in tidyverse the best, I'd like to have the solution there, if possible. Thanks for any help!!

CodePudding user response:

After splitting the data set by the values of the first column, we can use the list2env function to create a separate dataframe for each subset in the global environment, as follows:

library(tidyverse)

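grouped <- df %>% group_by(first)

# group_split() returns groups in sorted key order, not order of appearance,
# so take the names from group_keys(), which uses the same order
setNames(group_split(grouped), paste0("df.split_", group_keys(grouped)$first)) %>%
  list2env(envir = globalenv())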
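
For comparison, here is a minimal base R sketch of the same split-then-list2env idea. split() names each list element after its group value, so the names always line up with the data. The names `pieces` and `env` are made up for this sketch, and a fresh environment is used only to avoid cluttering the workspace; pass globalenv() instead to get top-level objects.

```r
first <- c("Jon", "Bill", "Bill", "Maria", "Ben", "Tina")
age <- c(23, 41, 41, 32, 58, 26)
df <- data.frame(first, age)

# split() returns a list whose names are the values of `first`
pieces <- split(df, df$first)
names(pieces) <- paste0("df.split_", names(pieces))

# Copy each list element into an environment as its own dataframe.
# `env` is a hypothetical holding environment; use globalenv() for top-level objects.
env <- list2env(pieces, envir = new.env())

nrow(env$df.split_Bill)  # 2
```

Keeping the pieces in a named list (pieces$df.split_Bill) is often enough on its own, without creating separate objects at all.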

CodePudding user response:

Another alternative:

library(tidyverse)

df %>%
  group_split(first) %>%
  # name each dataframe after the value of `first` in its first row
  walk(~ assign(str_c("df.split_", .x$first[1]), value = .x, envir = .GlobalEnv))

names(.GlobalEnv)
#> [1] "df.split_Bill"  "first"          "df.split_Maria" "df.split_Ben"  
#> [5] "df.split_Tina"  "age"            "df.split_Jon"   "df"

Created on 2022-01-01 by the reprex package (v2.0.1)
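
If the goal is actual files on disk, as the question's first sentence suggests, a rough sketch with group_walk() and write_csv() might look like the following. The output folder `out_dir` and the .csv naming are assumptions for illustration only, not something from the original post.

```r
library(tidyverse)

first <- c("Jon", "Bill", "Bill", "Maria", "Ben", "Tina")
age <- c(23, 41, 41, 32, 58, 26)
df <- data.frame(first, age)

# Hypothetical output folder; tempdir() is used only so the sketch runs anywhere
out_dir <- file.path(tempdir(), "df_split")
dir.create(out_dir, showWarnings = FALSE)

df %>%
  group_by(first) %>%
  # .x is one group's rows, .y is a one-row tibble holding that group's key;
  # .keep = TRUE keeps the `first` column in .x
  group_walk(
    ~ write_csv(.x, file.path(out_dir, str_c("df.split_", .y$first, ".csv"))),
    .keep = TRUE
  )

list.files(out_dir)
```

This writes one file per value of first without creating any extra objects in the workspace.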
