I want to reorder the columns of some dataframes according to a specific pattern. I have columns with the prefixes 'Saldo' and 'Concessões', each with the suffix 'Real' or 'PIB'. I want to put them in this order: Saldo...PIB, Concessões...PIB, Saldo...Real, Concessões...Real.
Initial column names:
Saldo...PIB | Saldo...Real | Concessões...PIB | Concessões...Real
Desired output:
Saldo...PIB | Concessões...PIB | Saldo...Real | Concessões...Real
I've tried some combinations of select() and matches(), but I'm not very good at regular expressions. Thanks in advance!
CodePudding user response:
Although you can use regular expressions, you can also use ends_with(), which might be easier. Here is a solution where df is your dataframe.
library(dplyr)
df |>
select(ends_with("PIB"), ends_with("Real"))
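If you would still like to see the regular-expression route with matches(), here is a minimal sketch. The data frame and its column names are made up for illustration (they are not your real names), and the patterns assume the suffix sits at the very end of each column name. Note that matches() ignores case by default.

```r
library(dplyr)

# Hypothetical stand-in for the real data frame; column names are assumptions
df <- data.frame(
  Saldo_PIB = 1, Saldo_Real = 2,
  Concessoes_PIB = 3, Concessoes_Real = 4
)

# "$" anchors each pattern to the end of the column name,
# so all "...PIB" columns come first, then all "...Real" columns
reordered <- df |>
  select(matches("PIB$"), matches("Real$"))

names(reordered)
```

Within each group the columns keep their original relative order, which is why Saldo comes before Concessões here.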
CodePudding user response:
Your idea to use select() was just fine!
I created some example data to illustrate my answer.
library(tidyverse)
df <- data.frame(SaldoReal = 1, SaldoPIB = 2, ConcessionsReal = 3, ConcessionsPIB = 4)
df %>% select(ends_with("PIB"), ends_with("Real"))
Results:
  SaldoPIB ConcessionsPIB SaldoReal ConcessionsReal
1        2              4         1               3
Hope this helps.
CodePudding user response:
I'm not entirely sure this approach fits your needs (I really need to work on my regex skills), but it seems to work with the example you gave.
library(stringr)
# you can extract the column names of your df with `colnames()`
names <- c("Saldo...PIB", "Saldo...Real", "Concessões...PIB", "Concessões...Real")
ind_pib <- stringr::str_detect(names, pattern = "PIB")
ind_real <- stringr::str_detect(names, pattern = "Real")
c(names[ind_pib], names[ind_real])
#> [1] "Saldo...PIB" "Concessões...PIB" "Saldo...Real"
#> [4] "Concessões...Real"
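To actually reorder the data frame rather than just the vector of names, the same idea might be applied like this. This is only a sketch: the data frame below is a hypothetical example with simplified ASCII column names, and `new_order` and `df_sorted` are names I made up.

```r
library(stringr)

# Hypothetical example data; replace with your own data frame
df <- data.frame(SaldoReal = 1, SaldoPIB = 2, ConcessionsReal = 3, ConcessionsPIB = 4)

cols <- colnames(df)

# Names ending in "PIB" first, then names ending in "Real"
new_order <- c(
  cols[str_detect(cols, "PIB$")],
  cols[str_detect(cols, "Real$")]
)

# Index the data frame by the reordered name vector
df_sorted <- df[new_order]
colnames(df_sorted)
```

Any column that matches neither pattern is dropped by this indexing, so you would need to append those names to `new_order` if you want to keep them.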