How to name a list created by group_split WITHOUT breaking the pipeline IF there are several groupin

Time:05-24

So, this is a follow-up to the previous question that I asked. After two years, I am now facing a similar problem, but I want to name the list elements according to level combinations.

Let's say that we are working with the mtcars dataset and that we do this:

mtcars %>% 
    group_split(gear, carb)

The resulting list's elements will not be named. However, I would like the first element to be named '3_1' since it contains cars that have gear 3 and carb 1, the second one should be named '3_2' and so on.

The solution from the previous question works only if you have one grouping variable. With two, as here (gear and carb), it doesn't work, because names() must be the same length as x.

I have tried using paste and unite to force the group keys into a single column, but it doesn't seem to work. Any help would be appreciated.

CodePudding user response:

The group_split help page says "it works like base::split() but... does not name the elements". If you do want element names, I would recommend using base::split():

library(dplyr)  # provides %>%

mtcars %>%
  split(., paste(.$gear, .$carb, sep = "_"))
# $`3_1`
#                 mpg cyl  disp  hp drat    wt  qsec vs am gear carb
# Hornet 4 Drive 21.4   6 258.0 110 3.08 3.215 19.44  1  0    3    1
# Valiant        18.1   6 225.0 105 2.76 3.460 20.22  1  0    3    1
# Toyota Corona  21.5   4 120.1  97 3.70 2.465 20.01  1  0    3    1
# 
# $`3_2`
#                    mpg cyl disp  hp drat    wt  qsec vs am gear carb
# Hornet Sportabout 18.7   8  360 175 3.15 3.440 17.02  0  0    3    2
# Dodge Challenger  15.5   8  318 150 2.76 3.520 16.87  0  0    3    2
# AMC Javelin       15.2   8  304 150 3.15 3.435 17.30  0  0    3    2
# Pontiac Firebird  19.2   8  400 175 3.08 3.845 17.05  0  0    3    2
# ...
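As a variation on the same idea, base::split() also accepts a list of grouping vectors and builds the combined names itself. The sketch below assumes that the sep and drop arguments are passed through to the default split method (which is how base R documents them); the variable name by_gear_carb is only illustrative.

```r
# split() combines a list of grouping vectors via interaction().
# sep = "_" sets the separator used in the element names;
# drop = TRUE discards gear/carb combinations that never occur in the data.
by_gear_carb <- split(
  mtcars,
  list(mtcars$gear, mtcars$carb),
  sep = "_",
  drop = TRUE
)

names(by_gear_carb)
by_gear_carb[["3_1"]]
```

Without drop = TRUE the list would also contain empty data frames for combinations that never occur together. The element order may differ from the paste() version, so look elements up by name rather than by position.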

CodePudding user response:

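You can achieve this by setting the grouping beforehand and then using group_keys() with interaction(). Note that interaction() joins the keys with "." by default, so the names come out as 3.1, 3.2, and so on; pass sep = "_" to interaction() if you want 3_1 instead.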

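library(dplyr)

# Store the grouped data under a new name so the built-in mtcars is not overwritten
grouped <- mtcars %>%
  group_by(gear, carb)

grouped %>%
  group_split() %>%  # the grouping is already set, so no columns are passed here
  # interaction() returns a factor; convert it to character before using it as names
  purrr::set_names(as.character(interaction(group_keys(grouped))))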

Output:

$`3.1`
# A tibble: 3 x 11
    mpg   cyl  disp    hp  drat    wt  qsec    vs    am  gear  carb
  <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
1  21.4     6  258    110  3.08  3.22  19.4     1     0     3     1
2  18.1     6  225    105  2.76  3.46  20.2     1     0     3     1
3  21.5     4  120.    97  3.7   2.46  20.0     1     0     3     1

$`3.2`
# A tibble: 4 x 11
    mpg   cyl  disp    hp  drat    wt  qsec    vs    am  gear  carb
  <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
1  18.7     8   360   175  3.15  3.44  17.0     0     0     3     2
2  15.5     8   318   150  2.76  3.52  16.9     0     0     3     2
3  15.2     8   304   150  3.15  3.44  17.3     0     0     3     2
4  19.2     8   400   175  3.08  3.84  17.0     0     0     3     2
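If you need this in several places, the two steps above can be wrapped in a small helper so the pipeline is not interrupted. This is only a sketch: named_group_split() is a hypothetical function name, not part of dplyr, and it assumes that group_keys() and group_split() return the groups in the same order, as they do in the answer above.

```r
library(dplyr)

# Hypothetical helper (not a dplyr function): split by the given columns
# and name each element after its key combination, e.g. "3_1".
named_group_split <- function(.data, ..., sep = "_") {
  grouped <- group_by(.data, ...)
  keys <- group_keys(grouped)
  # Paste the key columns together row by row to get one name per group
  nms <- do.call(paste, c(unname(as.list(keys)), sep = sep))
  stats::setNames(group_split(grouped), nms)
}

res <- mtcars %>%
  named_group_split(gear, carb)

names(res)
res[["3_2"]]
```

Because the names are built from group_keys() rather than from a single column, the same helper works for one grouping variable or for several.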