Group by one variable and filter by another variable on a condition, in R

Time:07-28

I have the following data frame:

data <- data.frame(group = c("A", "A", "A",  "B", "C", "C", "C", "C"),
                   x = c("A1", "A12", "A123", "BA", "C12", "CA", "C123", "C132"),
                   y = c("ir1", "ir2", "ir3",  "ir4", "ir5", "ir6", "ir7", "ir8"))     
data                   

> data                   
  group    x   y
1     A   A1 ir1
2     A  A12 ir2
3     A A123 ir3
4     B   BA ir4
5     C  C12 ir5
6     C   CA ir6
7     C C123 ir7
8     C C132 ir8

I would like to group by the group variable and, within each group, keep the row whose x value has the fewest characters. The required output is

  group  x   y
1     A A1 ir1
2     B BA ir4
3     C CA ir6

Thank you

CodePudding user response:

Try this

library(dplyr)

data |> group_by(group) |>
  # y first: once x is reassigned, nchar(x) would see only the summarised value
  summarise(y = y[which.min(nchar(x))],
            x = x[which.min(nchar(x))]) |>
  relocate(x, .before = y)
Output:
# A tibble: 3 × 3
  group x     y    
  <chr> <chr> <chr>
1 A     A1    ir1  
2 B     BA    ir4  
3 C     CA    ir6  
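
A variant of the same idea, as a minimal sketch: it picks the whole row with slice(), so the result does not depend on the order of expressions inside summarise(). The name res is only illustrative. which.min() returns the position of the first minimum, so if two x values in a group were equally short, the earlier row would be kept.

```r
library(dplyr)

data <- data.frame(group = c("A", "A", "A", "B", "C", "C", "C", "C"),
                   x = c("A1", "A12", "A123", "BA", "C12", "CA", "C123", "C132"),
                   y = c("ir1", "ir2", "ir3", "ir4", "ir5", "ir6", "ir7", "ir8"))

# For each group, keep the single row whose x has the fewest characters.
res <- data |>
  group_by(group) |>
  slice(which.min(nchar(x))) |>
  ungroup()

res
```

Because the row is selected as a unit, x and y always come from the same original row.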

CodePudding user response:

A nice succinct option is using dplyr::slice_min(nchar(...)):

library(dplyr)

data <- data.frame(group = c("A", "A", "A",  "B", "C", "C", "C", "C"),
                   x = c("A1", "A12", "A123", "BA", "C12", "CA", "C123", "C132"),
                   y = c("ir1", "ir2", "ir3",  "ir4", "ir5", "ir6", "ir7", "ir8")) 

data %>% 
  group_by(group) %>% 
  slice_min(nchar(x)) %>% 
  ungroup()
#> # A tibble: 3 × 3
#>   group x     y    
#>   <chr> <chr> <chr>
#> 1 A     A1    ir1  
#> 2 B     BA    ir4  
#> 3 C     CA    ir6

Created on 2022-07-27 by the reprex package (v2.0.1)
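
One point worth checking with slice_min(): by default it keeps ties (with_ties = TRUE), so a group can return more than one row when several x values share the minimum length. The sketch below uses a made-up data frame, tied, which is not part of the question, to show the difference; with_ties = FALSE limits each group to one row.

```r
library(dplyr)

# Hypothetical data: in group "A", both "A1" and "A2" have 2 characters.
tied <- data.frame(group = c("A", "A", "B"),
                   x = c("A1", "A2", "B123"),
                   y = c("ir1", "ir2", "ir3"))

# Default behaviour: every row that ties for the minimum is kept.
with_ties_kept <- tied %>%
  group_by(group) %>%
  slice_min(nchar(x)) %>%
  ungroup()

# with_ties = FALSE: exactly one row per group.
one_per_group <- tied %>%
  group_by(group) %>%
  slice_min(nchar(x), with_ties = FALSE) %>%
  ungroup()

nrow(with_ties_kept)  # 3 rows: both tied rows of "A" plus the row of "B"
nrow(one_per_group)   # 2 rows: one per group
```

The data in the question has no ties within a group, so both calls give the same three rows there.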
