reorder column names by last character

Time:01-23

I have a data frame df with the following column names.

colnames(df) gives:

"SUBJID" "EoT_A"  "EoT_B"  "EoT_C"  "EoT_D"  "PR_A"   "PR_B"   "PR_C"   "PR_D"  
"PD_A"   "PD_B"   "PD_C"   "PD_D"   "CR_A"   "CR_B"   "CR_C"   "CR_D"

I would like to reorder the columns like this:

"SUBJID" 
"EoT_A" "PR_A" "PD_A" "CR_A"
"EoT_B" "PR_B" "PD_B" "CR_B"
"EoT_C" "PR_C" "PD_C" "CR_C"
"EoT_D" "PR_D" "PD_D" "CR_D"            

Would there be a smart way to achieve this?

CodePudding user response:

You could use dplyr::ends_with, e.g.

df |> 
  dplyr::select(SUBJID, dplyr::ends_with(LETTERS[1:4])) |> 
  colnames()

 [1] "SUBJID" "EoT_A"  "PR_A"   "PD_A"   "CR_A"   "EoT_B"  "PR_B"   "PD_B"  
 [9] "CR_B"   "EoT_C"  "PR_C"   "PD_C"   "CR_C"   "EoT_D"  "PR_D"   "PD_D"  
[17] "CR_D" 

CodePudding user response:

I don't know how smart it is, but you can do

df[c(1, order(sapply(strsplit(names(df), '_'), function(x) rev(x)[1])[-1]) + 1)]

For example, if your data frame looks like this:

df
#>   SUBJID EoT_A EoT_B EoT_C EoT_D PR_A PR_B PR_C PR_D PD_A PD_B PD_C PD_D CR_A CR_B CR_C CR_D
#> 1      1     2     3     4     5    6    7    8    9   10   11   12   13   14   15   16   17

Then the code puts your data into the required order:

df[c(1, order(sapply(strsplit(names(df), '_'), function(x) rev(x)[1])[-1]) + 1)]
#>   SUBJID EoT_A PR_A PD_A CR_A EoT_B PR_B PD_B CR_B EoT_C PR_C PD_C CR_C EoT_D PR_D PD_D CR_D
#> 1      1     2    6   10   14     3    7   11   15     4    8   12   16     5    9   13   17
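The one-liner is easier to follow when split into steps. The following is a minimal sketch, assuming a one-row data frame like the one above; the intermediate names `cols`, `suffix`, `idx` and `result` are only illustrative and do not come from the answer.

```r
# Rebuild a one-row example data frame with the column names from the question.
cols <- c("SUBJID", paste0(rep(c("EoT", "PR", "PD", "CR"), each = 4), "_", LETTERS[1:4]))
df <- as.data.frame(setNames(as.list(seq_along(cols)), cols))

# Split each name at "_" and keep the last piece
# ("A", "B", ...; for "SUBJID" the only piece is the name itself).
suffix <- sapply(strsplit(names(df), "_"), function(x) rev(x)[1])

# Order the suffixes of every column except the first, then add 1 so the
# positions refer to columns of df rather than to the shortened vector.
idx <- order(suffix[-1]) + 1

# Keep column 1 in front and append the reordered columns.
result <- df[c(1, idx)]
names(result)
```

Because order() leaves ties in their original order, the EoT, PR, PD, CR sequence is preserved within each letter.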

CodePudding user response:

Another option is to use sub to extract the part after the last underscore and order the columns by it alphabetically. Because the first column is excluded from the ordering, add 1 to the result of order so that the indices point at the right columns of df:

df[c(1, 1 + order(sub('.*_', '', colnames(df[,-1]))))]
#>   SUBJID EoT_A PR_A PD_A CR_A EoT_B PR_B PD_B CR_B EoT_C PR_C PD_C CR_C EoT_D
#> 1      1     1    1    1    1     1    1    1    1     1    1    1    1     1
#>   PR_D PD_D CR_D
#> 1    1    1    1

Created on 2023-01-22 with reprex v2.0.2
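If the same reordering is needed in several places, the sub approach could be wrapped in a small function. This is only a sketch: `reorder_by_suffix` and its `id` argument are hypothetical names introduced here, and it assumes the id column is called SUBJID and every other column name contains an underscore.

```r
# Hypothetical helper (not part of the answers above): keep the id column(s)
# first, then sort the remaining columns by the text after their last underscore.
reorder_by_suffix <- function(df, id = "SUBJID") {
  rest <- setdiff(names(df), id)
  suffix <- sub(".*_", "", rest)
  # order() leaves ties in their original order, so columns that share a
  # suffix keep their relative position.
  df[c(id, rest[order(suffix)])]
}

# Example data frame with the column names from the question.
cols <- c("SUBJID", paste0(rep(c("EoT", "PR", "PD", "CR"), each = 4), "_", LETTERS[1:4]))
df <- as.data.frame(setNames(as.list(rep(1, length(cols))), cols))

reordered <- reorder_by_suffix(df)
names(reordered)
```

Selecting by name rather than by position avoids the index shift of 1, at the cost of assuming the id column name is known.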

CodePudding user response:

Assuming x holds your column names, you can order them by their last character, which substring extracts when it starts at position nchar of each name.

c(x[1], x[-1][order(substring(x[-1], nchar(x[-1])))])
# [1] "SUBJID" "EoT_A"  "PR_A"   "PD_A"   "CR_A"   "EoT_B" 
# [7] "PR_B"   "PD_B"   "CR_B"   "EoT_C"  "PR_C"   "PD_C"  
# [13] "CR_C"   "EoT_D"  "PR_D"   "PD_D"   "CR_D"  
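To see what the nested call does, it can be unpacked as follows. This is a minimal sketch in which `x` is rebuilt from the question's column names and `last_char` and `reordered` are illustrative names.

```r
# Column names from the question.
x <- c("SUBJID", paste0(rep(c("EoT", "PR", "PD", "CR"), each = 4), "_", LETTERS[1:4]))

# substring() starting at the last position of each name returns its last character.
last_char <- substring(x[-1], nchar(x[-1]))

# Order the non-id names by that character and put the id name back in front.
reordered <- c(x[1], x[-1][order(last_char)])
reordered
```

The resulting vector can then be used to reorder the data frame itself, for example with df[reordered].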