Arrange columns by prefix in column names

Time:03-02

I've got a data frame with hundreds of columns; a simplified sample is given under "Sample data" below.

I need to arrange the order of specific columns so they are "kept together" based on the prefix in their column names, i.e. v_1, v_2, v_3 and spr_1, spr_2, spr_3 in the sample given, so that the columns of each group end up next to each other.

Because the original data frame has so many columns, the columns need to be selected by prefix (e.g. "spr_") rather than by naming each one explicitly (e.g. c(spr_1, spr_2, spr_3)). A tidyverse approach would be ideal, since I already use the library.

Sample data:

library(tidyverse)

df <- data.frame(
  v_1 = c('A', 'B', 'C'),
  xyz = c(1,2,3),
  spr_1 = c('AA', 'BB', 'CC'),
  spr_2 = c('DD', 'EE', 'FF'),
  v_2 = c('D', 'E', 'F'),
  quert = c('X', 'G', 'T'),
  spr_3 = c('GG', 'HH', 'II'),
  v_3 = c('G', 'H', 'I')
)

CodePudding user response:

A possible solution:

library(dplyr)

df %>% 
  relocate(sort(names(.)))

#>   quert spr_1 spr_2 spr_3 v_1 v_2 v_3 xyz
#> 1     X    AA    DD    GG   A   D   G   1
#> 2     G    BB    EE    HH   B   E   H   2
#> 3     T    CC    FF    II   C   F   I   3
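
If the goal is only to group the prefixed columns and leave the remaining columns in their original order, rather than sorting every name, a minimal sketch with dplyr's select() and starts_with() might look like the following. It assumes the prefixes "v_" and "spr_" from the question; the name grouped is only illustrative.

```r
library(dplyr)

# Sample data from the question
df <- data.frame(
  v_1 = c('A', 'B', 'C'),
  xyz = c(1, 2, 3),
  spr_1 = c('AA', 'BB', 'CC'),
  spr_2 = c('DD', 'EE', 'FF'),
  v_2 = c('D', 'E', 'F'),
  quert = c('X', 'G', 'T'),
  spr_3 = c('GG', 'HH', 'II'),
  v_3 = c('G', 'H', 'I')
)

# Pick each group by prefix, then append every column not yet selected
grouped <- df %>%
  select(starts_with("v_"), starts_with("spr_"), everything())

names(grouped)
```

Within each group the columns keep the order they had in df, and everything() appends the columns that were not matched by a prefix.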

CodePudding user response:

Another possible solution, ordering the column names in increasing order (decreasing = FALSE):

df1 <- df[, order(colnames(df), decreasing = FALSE)]

Output:

  quert spr_1 spr_2 spr_3 v_1 v_2 v_3 xyz
1     X    AA    DD    GG   A   D   G   1
2     G    BB    EE    HH   B   E   H   2
3     T    CC    FF    II   C   F   I   3

CodePudding user response:

Another way would be:

df <- df[order(names(df))]

#>   quert spr_1 spr_2 spr_3 v_1 v_2 v_3 xyz
#> 1     X    AA    DD    GG   A   D   G   1
#> 2     G    BB    EE    HH   B   E   H   2
#> 3     T    CC    FF    II   C   F   I   3

CodePudding user response:

Or, since a data frame is a list with sortable names, more classically:

df %>%
    .[order(names(.))]

The dot . in the %>% pipeline stands for the incoming data frame.
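
One caveat with the name-sorting approaches: names are compared as strings, so with two-digit suffixes spr_10 would sort before spr_2. A rough base R sketch that orders by prefix first and by numeric suffix second is shown below. The helper natural_col_order() and the column names in nms are hypothetical, made up for illustration, and the sketch assumes names of the form prefix_number.

```r
# Hypothetical helper: order names by prefix, then by trailing number
natural_col_order <- function(nms) {
  has_num <- grepl("_[0-9]+$", nms)
  # Text before the trailing "_<digits>" (whole name if there is none)
  prefix <- sub("_[0-9]+$", "", nms)
  # Trailing number as integer, NA for names without one
  suffix <- as.integer(ifelse(has_num, sub("^.*_", "", nms), NA))
  order(prefix, suffix)
}

# Made-up names with two-digit suffixes (not from the original data)
nms <- c("v_10", "xyz", "spr_2", "v_2", "spr_10", "quert", "v_1")
sorted_nms <- nms[natural_col_order(nms)]
sorted_nms
```

Applied to a data frame, this would be used as df[natural_col_order(names(df))].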

CodePudding user response:

If you have a large table, you might consider data.table. Its setcolorder() function reorders the columns in place (by reference) instead of creating a copy:

data.table::setcolorder(df, sort(names(df)))
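
setcolorder() also accepts a partial list of names, in which case the listed columns are moved to the front and the rest follow in their existing order. A minimal sketch of a prefix-based variant, assuming the data is first converted with as.data.table() and using the prefixes from the question (dt and front are illustrative names):

```r
library(data.table)

# Sample data from the question, converted to a data.table
dt <- as.data.table(data.frame(
  v_1 = c('A', 'B', 'C'),
  xyz = c(1, 2, 3),
  spr_1 = c('AA', 'BB', 'CC'),
  spr_2 = c('DD', 'EE', 'FF'),
  v_2 = c('D', 'E', 'F'),
  quert = c('X', 'G', 'T'),
  spr_3 = c('GG', 'HH', 'II'),
  v_3 = c('G', 'H', 'I')
))

# Collect the prefixed columns by matching on the start of each name
front <- c(grep("^v_", names(dt), value = TRUE),
           grep("^spr_", names(dt), value = TRUE))

# Moves the listed columns to the front by reference; no assignment needed
setcolorder(dt, front)
names(dt)
```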