I have the following data frame:
data <- data.frame(group = c("A", "A", "A", "B", "C", "C", "C", "C"),
x = c("A1", "A12", "A123", "BA", "C12", "CA", "C123", "C132"),
y = c("ir1", "ir2", "ir3", "ir4", "ir5", "ir6", "ir7", "ir8"))
data
group x y
1 A A1 ir1
2 A A12 ir2
3 A A123 ir3
4 B BA ir4
5 C C12 ir5
6 C CA ir6
7 C C123 ir7
8 C C132 ir8
I would like to group by the group variable and, within each group, keep the row whose x value has the fewest characters. The required output is:
group x y
1 A A1 ir1
2 B BA ir4
3 C CA ir6
Thank you
CodePudding user response:
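Try this. Note that inside summarise() each new column is visible to the expressions that follow it, so y has to be computed before x is overwritten with a single value; otherwise y is always taken from the first row of the group:
library(dplyr)
data |> group_by(group) |>
# compute y first, while x is still the full per-group column
summarise(y = y[which.min(nchar(x))],
x = x[which.min(nchar(x))]) |>
# restore the original column order: group, x, y
relocate(x, .before = y) |> ungroup()
- output
# A tibble: 3 × 3
group x y
<chr> <chr> <chr>
1 A A1 ir1
2 B BA ir4
3 C CA ir6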
CodePudding user response:
A nice succinct option is using dplyr::slice_min(nchar(...)):
library(dplyr)
data <- data.frame(group = c("A", "A", "A", "B", "C", "C", "C", "C"),
x = c("A1", "A12", "A123", "BA", "C12", "CA", "C123", "C132"),
y = c("ir1", "ir2", "ir3", "ir4", "ir5", "ir6", "ir7", "ir8"))
data %>%
group_by(group) %>%
slice_min(nchar(x)) %>%
ungroup()
#> # A tibble: 3 × 3
#> group x y
#> <chr> <chr> <chr>
#> 1 A A1 ir1
#> 2 B BA ir4
#> 3 C CA ir6
Created on 2022-07-27 by the reprex package (v2.0.1)
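One caveat with slice_min(): by default it keeps every row tied for the minimum (with_ties = TRUE), so a group with two equally short x values would return two rows. The sample data above has no such ties, so the sketch below uses a hypothetical data frame, tied, that is not from the question, to show the difference. Passing with_ties = FALSE should give exactly one row per group.

```r
library(dplyr)

# Hypothetical data (not from the question): group "A" has two x values of length 2
tied <- data.frame(group = c("A", "A", "B"),
                   x = c("A1", "A2", "BA"),
                   y = c("ir1", "ir2", "ir3"))

# Default (with_ties = TRUE): both tied rows of group A are kept
kept_ties <- tied %>%
  group_by(group) %>%
  slice_min(nchar(x)) %>%
  ungroup()

# with_ties = FALSE: one row per group, the first of the tied rows
dropped_ties <- tied %>%
  group_by(group) %>%
  slice_min(nchar(x), with_ties = FALSE) %>%
  ungroup()

nrow(kept_ties)     # 3
nrow(dropped_ties)  # 2
```

If the required output must have one row per group regardless of ties, with_ties = FALSE is the safer choice.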