I have a data.frame that looks like df.
I want to sort the rows by the genes column so that the genes starting with the AT1G... pattern come first.
library(tidyverse)
df <- tibble(
  genes = c("18S", "ACLA", "AT1G25240", "AT1G25241", "AT1G25242"),
  functions = c("ribosome", "dunno", "flowering", "O2", "photosynthesis")
)
df
#> # A tibble: 5 × 2
#>   genes     functions
#>   <chr>     <chr>
#> 1 18S       ribosome
#> 2 ACLA      dunno
#> 3 AT1G25240 flowering
#> 4 AT1G25241 O2
#> 5 AT1G25242 photosynthesis
Created on 2022-09-28 with reprex v2.0.2
I want my data to look like this:
genes     functions
AT1G25240 flowering
AT1G25241 O2
AT1G25242 photosynthesis
ACLA      dunno
18S       ribosome
Any idea or help is highly appreciated! The rationale is that, in a huge data set, I want to see the core genes that start with AT.. first.
Answer:
If you sort (arrange) by the presence of the pattern using grepl, then FALSE (pattern not found) sorts first, because logical values sort as FALSE before TRUE. If we negate the result of grepl, the matching rows come first, which is what you want:
df %>%
  arrange(!grepl("^AT1G", genes))
# # A tibble: 5 x 2
#   genes     functions
#   <chr>     <chr>
# 1 AT1G25240 flowering
# 2 AT1G25241 O2
# 3 AT1G25242 photosynthesis
# 4 18S       ribosome
# 5 ACLA      dunno
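To see why the negation is needed, here is a minimal base R sketch of the same idea. The variable names genes and is_core are made up for illustration, and order() stands in for arrange(); it is only meant to show how logical values sort, not to replace the dplyr call above.

```r
# Same gene names as in the question
genes <- c("18S", "ACLA", "AT1G25240", "AT1G25241", "AT1G25242")

# grepl() returns TRUE where the name starts with "AT1G"
is_core <- grepl("^AT1G", genes)

# Logical values sort as FALSE before TRUE, so matches would end up last
sort(is_core)

# Negating the flag turns matches into FALSE, so they are ordered first;
# tied elements keep their original relative order
genes[order(!is_core)]
```

Sorting on is_core directly would put the AT1G genes at the bottom, which is why the answer sorts on !grepl(...) instead.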
You can add other arguments to arrange for secondary sorts, e.g., arrange(!grepl(..), genes, functions).
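As a rough illustration of a secondary sort, the sketch below uses a hypothetical, deliberately shuffled copy of the data (df_shuffled is not in the original question) so that the effect of the second argument is visible. It assumes dplyr is installed.

```r
library(dplyr)

# Hypothetical input: the same rows as df, but in shuffled order
df_shuffled <- tibble(
  genes = c("ACLA", "AT1G25242", "18S", "AT1G25240", "AT1G25241"),
  functions = c("dunno", "photosynthesis", "ribosome", "flowering", "O2")
)

# First key: AT1G genes first; second key: alphabetical within each group
sorted <- df_shuffled %>%
  arrange(!grepl("^AT1G", genes), genes)

sorted$genes
```

Here the AT1G genes come first in ascending order, followed by the remaining genes, also sorted among themselves.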