Sort columns by a specific criteria in R

Time:07-21

I want to reorder the columns of some data frames according to a specific pattern. Some columns have the prefix 'Saldo' and others the prefix 'Concessões', and each one ends with the suffix 'Real' or 'PIB'. I want to put them in this order: Saldo...PIB, Concessões...PIB, Saldo...Real, Concessões...Real.

Initial column names:

Saldo...PIB | Saldo...Real | Concessões...PIB | Concessões...Real

Desired output:

Saldo...PIB | Concessões...PIB | Saldo...Real | Concessões...Real

I've tried some combinations of select() and matches(), but I'm not so good at regular expressions. Thanks in advance!

CodePudding user response:

You can use regular expressions, but ends_with() is probably easier. In the solution below, df is your data frame. Within each group the columns keep their original relative order, so Saldo stays ahead of Concessões.

library(dplyr)

df |>
  select(ends_with("PIB"), ends_with("Real"))
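
For anyone who prefers the regular-expression route, here is a minimal sketch of the same idea using matches(). The data frame and its values are made up for illustration; only the column names come from the question. check.names = FALSE is an assumption added so that the names are kept exactly as written.

```r
library(dplyr)

# Hypothetical data: only the column names are taken from the question
df <- data.frame(
  `Saldo...PIB` = 1,
  `Saldo...Real` = 2,
  `Concessões...PIB` = 3,
  `Concessões...Real` = 4,
  check.names = FALSE
)

# "$" anchors the pattern to the end of the column name
df_sorted <- df |>
  select(matches("PIB$"), matches("Real$"))

colnames(df_sorted)
```

The anchored patterns behave like ends_with() here; the regex form is mostly useful if the suffix rule later becomes more complicated.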

CodePudding user response:

Your idea to use select was just fine! I created some example data to illustrate my answer.

library(tidyverse)

df <- data.frame(SaldoReal = 1, SaldoPIB = 2, ConcessionsReal = 3, ConcessionsPIB = 4)

df %>% select(ends_with("PIB"), ends_with("Real"))

Results:

  SaldoPIB ConcessionsPIB SaldoReal ConcessionsReal
1        2              4         1               3

Hope this helps.
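
One caveat: ends_with() keeps the columns in their original relative order within each suffix group. If the prefixes are not already in the order you want, a rough sketch like the one below combines starts_with() and ends_with() with & to spell out each position. The shuffled data frame is hypothetical and reuses the simplified names from my example.

```r
library(dplyr)

# Hypothetical data with the columns deliberately out of order
df <- data.frame(ConcessionsReal = 3, SaldoReal = 1, ConcessionsPIB = 4, SaldoPIB = 2)

# Each line selects one prefix/suffix combination, in the desired order
df_sorted <- df %>%
  select(
    starts_with("Saldo") & ends_with("PIB"),
    starts_with("Concessions") & ends_with("PIB"),
    starts_with("Saldo") & ends_with("Real"),
    starts_with("Concessions") & ends_with("Real")
  )

colnames(df_sorted)
```

This is more verbose, but it does not depend on how the columns happen to be arranged in the input.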

CodePudding user response:

I'm not sure this approach really fits your needs (I need to work on my regex skills), but it seems to work with the example you gave.

library(stringr)

# you can extract the column names of your df with `colnames()`
names <- c("Saldo...PIB", "Saldo...Real", "Concessões...PIB", "Concessões...Real")

ind_pib <- stringr::str_detect(names, pattern = "PIB")
ind_real <- stringr::str_detect(names, pattern = "Real")

c(names[ind_pib], names[ind_real])
#> [1] "Saldo...PIB"       "Concessões...PIB"  "Saldo...Real"     
#> [4] "Concessões...Real"
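
To carry this over from the vector of names to an actual data frame, something along these lines should work. It is only a sketch: the data frame and its values are invented, and the variable is called cols here to avoid masking base R's names().

```r
library(stringr)

# Hypothetical data: only the column names are taken from the question
df <- data.frame(
  `Saldo...PIB` = 1,
  `Saldo...Real` = 2,
  `Concessões...PIB` = 3,
  `Concessões...Real` = 4,
  check.names = FALSE
)

cols <- colnames(df)

# Logical index for each suffix; "$" anchors the match to the end of the name
ind_pib <- str_detect(cols, "PIB$")
ind_real <- str_detect(cols, "Real$")

# Reorder the data frame by indexing with the rearranged names
df_sorted <- df[, c(cols[ind_pib], cols[ind_real])]

colnames(df_sorted)
```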