compact way to create filtered dataframes cycling through different values (R)

Time:06-09

I have a dataframe with a column named CHR, which takes discrete values from 1 to 18 (1, 2, 3 ...).

I want to subset the dataframe for each value of CHR. So far my (working) code looks like this:

CH1<-boxplot %>% filter(CHR == "1")
CH2<-boxplot %>% filter(CHR == "2")
CH3<-boxplot %>% filter(CHR == "3")
               .
               .
               .
CH18<-boxplot %>% filter(CHR == "18")

It does get the job done, but I'm very annoyed whenever my code looks like that. I want to learn one "proper" way so I can apply it to multiple other similar cases.

CodePudding user response:

You have a few options:

1. Write a function. You will still need one line per subset, but each line is shorter.

bx_filter <- function(boxplot, chr) {
  boxplot %>% filter(CHR == chr)
}

CH1 <- bx_filter(boxplot, "1")
CH2 <- bx_filter(boxplot, "2")

2. Use split(), which returns a list in which each element is one of the data frames you're looking for.

split(boxplot, boxplot$CHR)
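
As a minimal sketch of working with the list that split() returns, the example below uses a small made-up data frame (the `value` column and its contents are assumptions for illustration, not taken from the question). The list elements are named after the values of CHR, so a single subset can be pulled out by name and the same operation can be applied to every piece.

```r
# Toy stand-in for the real data; `value` is a hypothetical column.
boxplot <- data.frame(
  CHR   = rep(1:18, each = 2),
  value = 1:36
)

# One data frame per CHR value, named "1", "2", ..., "18".
by_chr <- split(boxplot, boxplot$CHR)

# Pull out a single subset by name (the equivalent of CH3 in the question).
ch3 <- by_chr[["3"]]

# Apply the same operation to every piece, here the mean of `value`.
means <- sapply(by_chr, function(d) mean(d$value))
```

Keeping the pieces in one list means later steps can loop over `by_chr` instead of repeating a line per object.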

3. Combine purrr's map() with assign(), although writing to the global environment this way is generally frowned upon.

unique(boxplot$CHR) %>%
  map(function(chr) {
    assign(paste0('CH', chr), boxplot %>% filter(CHR == chr), envir = .GlobalEnv)
  })
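
If separate named objects are really needed, one possible alternative is base R's list2env(), which unpacks a named list into an environment. The sketch below assumes a made-up data frame (the `value` column is hypothetical) and writes into a dedicated environment, here called `chr_env` for illustration, rather than the global one.

```r
# Toy stand-in for the real data; `value` is a hypothetical column.
boxplot <- data.frame(CHR = rep(1:18, each = 2), value = 1:36)

# Split once, then rename the pieces to CH1, CH2, ..., CH18.
pieces <- split(boxplot, boxplot$CHR)
names(pieces) <- paste0("CH", names(pieces))

# Unpack into a dedicated environment instead of .GlobalEnv.
chr_env <- new.env()
list2env(pieces, envir = chr_env)

# The objects are then available by name inside that environment.
ch18 <- get("CH18", envir = chr_env)
```

Passing `envir = .GlobalEnv` instead would create CH1 to CH18 in the workspace, with the same caveat about cluttering the global environment.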

CodePudding user response:

group_split is one option:

library(tidyverse)

list <- tibble(chr = rep(1:18, 2), value = 1:36) |> 
  group_split(chr)

head(list)
#> <list_of<
#>   tbl_df<
#>     chr  : integer
#>     value: integer
#>   >
#> >[6]>
#> [[1]]
#> # A tibble: 2 × 2
#>     chr value
#>   <int> <int>
#> 1     1     1
#> 2     1    19
#> 
#> [[2]]
#> # A tibble: 2 × 2
#>     chr value
#>   <int> <int>
#> 1     2     2
#> 2     2    20
#> 
#> [[3]]
#> # A tibble: 2 × 2
#>     chr value
#>   <int> <int>
#> 1     3     3
#> 2     3    21
#> 
#> [[4]]
#> # A tibble: 2 × 2
#>     chr value
#>   <int> <int>
#> 1     4     4
#> 2     4    22
#> 
#> [[5]]
#> # A tibble: 2 × 2
#>     chr value
#>   <int> <int>
#> 1     5     5
#> 2     5    23
#> 
#> [[6]]
#> # A tibble: 2 × 2
#>     chr value
#>   <int> <int>
#> 1     6     6
#> 2     6    24

Created on 2022-06-08 by the reprex package (v2.0.1)
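
Note that group_split() returns an unnamed list. A minimal sketch of one way to attach names, assuming the same toy data as in the reprex and a "CH" prefix chosen to match the question:

```r
library(dplyr)

# Same toy data as in the reprex.
df <- tibble(chr = rep(1:18, 2), value = 1:36)

# group_split() returns an unnamed list, ordered by the grouping key.
by_chr <- df |> group_split(chr)

# group_keys() returns the keys in that same order, so they can serve as names.
keys <- df |> group_by(chr) |> group_keys()
names(by_chr) <- paste0("CH", keys$chr)

# Look up one group by name.
by_chr[["CH3"]]
```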

CodePudding user response:

Loop over the unique values of the CHR variable:

lapply(unique(boxplot$CHR), function(i) filter(boxplot, CHR == i))