Sort columns by specific criteria in R

I want to sort the columns of some data frames according to a specific pattern. I have columns with the prefixes 'Saldo' and 'Concessões', each with the suffix 'Real' or 'PIB'. I want to put them in this order: Saldo...PIB, Concessões...PIB, Saldo...Real, Concessões...Real.

Initial column names:

Saldo...PIB | Saldo...Real | Concessões...PIB | Concessões...Real

Desired output:

Saldo...PIB | Concessões...PIB | Saldo...Real | Concessões...Real

I've tried some combinations of select() and matches(), but I'm not very good at regular expressions. Thanks in advance!

CodePudding user response：

Although you can use regular expressions, you can also use ends_with(), which might be easier. Here is a solution where df is your data frame.

library(dplyr)

df |>
  select(ends_with("PIB"), ends_with("Real"))
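Since the question mentions matches(), here is a rough sketch of the same reordering written with anchored regular expressions instead of ends_with(). The data frame below is made up for illustration, and it assumes the column names literally contain the dots shown in the question. The `$` anchors each pattern to the end of the column name.

```r
library(dplyr)

# Hypothetical example data; check.names = FALSE keeps the names exactly as typed
df <- data.frame(
  "Saldo...PIB"       = 1,
  "Saldo...Real"      = 2,
  "Concessões...PIB"  = 3,
  "Concessões...Real" = 4,
  check.names = FALSE
)

# "PIB$" matches names ending in PIB, "Real$" matches names ending in Real
reordered <- select(df, matches("PIB$"), matches("Real$"))
names(reordered)
```

Within each group the columns keep their original relative order, so this only gives Saldo before Concessões if that is already the case in the input.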

CodePudding user response：

Your idea to use select was just fine! I created some example data to illustrate my answer.

library(tidyverse)

df <- data.frame(SaldoReal = 1, SaldoPIB = 2, ConcessionsReal = 3, ConcessionsPIB = 4)

df %>% select(ends_with("PIB"), ends_with("Real"))

Results:

  SaldoPIB ConcessionsPIB SaldoReal ConcessionsReal
1        2              4         1               3

Hope this helps.

CodePudding user response：

I'm not sure this approach really fits your needs (I need to work on my regex skills), but it seems to work with the example you gave.

library(stringr)

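# you can extract the column names of your df with `colnames()`
# (named `col_names` here to avoid masking the base function `names()`)
col_names <- c("Saldo...PIB", "Saldo...Real", "Concessões...PIB", "Concessões...Real")

# logical index for each suffix group
ind_pib <- str_detect(col_names, pattern = "PIB")
ind_real <- str_detect(col_names, pattern = "Real")

# PIB columns first, then Real columns
c(col_names[ind_pib], col_names[ind_real])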
#> [1] "Saldo...PIB"       "Concessões...PIB"  "Saldo...Real"     
#> [4] "Concessões...Real"
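The approaches above rely on Saldo already coming before Concessões in the input. If the columns might arrive in any order, one possible base R sketch is to rank each name by suffix and then by prefix and reorder with order(). The example data and the rank variables are assumptions for illustration only.

```r
# Hypothetical data with the columns deliberately shuffled
df <- data.frame(a = 1, b = 2, c = 3, d = 4)
names(df) <- c("Saldo...Real", "Concessões...PIB", "Saldo...PIB", "Concessões...Real")

cols <- names(df)

# Rank by suffix first (PIB before Real), then by prefix (Saldo before Concessões)
suffix_rank <- ifelse(endsWith(cols, "PIB"), 1L, 2L)
prefix_rank <- ifelse(startsWith(cols, "Saldo"), 1L, 2L)

# order() sorts by the first key and breaks ties with the second
df_sorted <- df[, order(suffix_rank, prefix_rank)]
names(df_sorted)
```

This assumes every column has one of the two prefixes and one of the two suffixes. Any other column would be treated as a Concessões or Real column, so it would need to be handled separately.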