Split many dataframes by a column, and save as different dataframes

Time:02-13

I have many data frames. I would like to split each of them based on the values in a column (a factor), and then store the results of the split in separate data frames that have specific names.

For the sake of a minimal reproducible example, consider some generated data:

for (i in 1:10) {
  assign(paste("df_", i, sep = ""), data.frame(x = rep(1, 12), y = c(rep("a", 4), rep("b", 4), rep("c", 4))))
}

Here we have 10 dfs, df_1, df_2, ... up to df_10. (The real data is similar to the generated data, but in the real data column z is different for each df.)

Now, I want to split the dfs by 'y' (column 2).

For 1 df, I can do the following:

splitdf <- split(df_1,df_1$y)
namessplit <- c("a","b","c")
for (i in 1:length(splitdf)) {
  assign(paste("df_1_",namessplit[[i]],sep = ""),splitdf[[i]])
}

While this works for 1 df, how can I do it for all the dfs?

Big thanks in advance!

CodePudding user response:

Creating multiple objects in the global env is not recommended. If we still want to know how to create the objects from a nested list: loop over the outer list sequence and then over the inner list sequence, paste the corresponding names together, and assign the extracted inner list element to that name.

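# collect df_1 ... df_10 into a named list and split each one by y
lst1 <- lapply(mget(ls(pattern = "^df_\\d+$")), \(x) split(x, x$y))
for (i in seq_along(lst1)) {
  for (j in seq_along(lst1[[i]])) {
    # build a name such as "df_1_a" and assign the matching piece to it
    assign(paste0(names(lst1)[i], "_", names(lst1[[i]])[j]), lst1[[i]][[j]])
  }
}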

Checking for the objects created in the global env:

> ls(pattern = "^df_\\d+_[a-z]+$")
 [1] "df_1_a"  "df_1_b"  "df_1_c"  "df_10_a" "df_10_b" "df_10_c" "df_2_a"  "df_2_b"  "df_2_c"  "df_3_a"  "df_3_b"  "df_3_c"  "df_4_a" 
[14] "df_4_b"  "df_4_c"  "df_5_a"  "df_5_b"  "df_5_c"  "df_6_a"  "df_6_b"  "df_6_c"  "df_7_a"  "df_7_b"  "df_7_c"  "df_8_a"  "df_8_b" 
[27] "df_8_c"  "df_9_a"  "df_9_b"  "df_9_c" 
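
Since creating many global objects is not recommended, here is a minimal sketch of an alternative that keeps every piece in one named list. It assumes the data frames can be built (or read) directly into a list; the names `dfs`, `nested` and `flat` are only illustrative and do not come from the question.

```r
# Recreate the example data in a named list instead of as separate objects
dfs <- lapply(1:10, function(i) {
  data.frame(x = rep(1, 12), y = rep(c("a", "b", "c"), each = 4))
})
names(dfs) <- paste0("df_", 1:10)

# Split each data frame by y; the result is a list of lists of data frames
nested <- lapply(dfs, function(d) split(d, d$y))

# Flatten one level; unlist() joins outer and inner names with ".",
# so replace that first "." with "_" to get names such as "df_1_a"
flat <- unlist(nested, recursive = FALSE)
names(flat) <- sub(".", "_", names(flat), fixed = TRUE)
```

Each piece is then available as, for example, `flat[["df_1_a"]]`. If separate objects are still needed afterwards, `list2env(flat, envir = globalenv())` should create them from the named list without an explicit loop.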