Reorder column names by last character

I have a data frame df with the following column names.

colnames(df) gives:

"SUBJID" "EoT_A"  "EoT_B"  "EoT_C"  "EoT_D"  "PR_A"   "PR_B"   "PR_C"   "PR_D"  
"PD_A"   "PD_B"   "PD_C"   "PD_D"   "CR_A"   "CR_B"   "CR_C"   "CR_D"

I would like to reorder the columns like this:

"SUBJID" 
"EoT_A" "PR_A" "PD_A" "CR_A"
"EoT_B" "PR_B" "PD_B" "CR_B"
"EoT_C" "PR_C" "PD_C" "CR_C"
"EoT_D" "PR_D" "PD_D" "CR_D"

Would there be a smart way to achieve this?

CodePudding user response：

You could use dplyr::ends_with, e.g.

df |> 
  dplyr::select(SUBJID, dplyr::ends_with(LETTERS[1:4])) |> 
  colnames()

 [1] "SUBJID" "EoT_A"  "PR_A"   "PD_A"   "CR_A"   "EoT_B"  "PR_B"   "PD_B"  
 [9] "CR_B"   "EoT_C"  "PR_C"   "PD_C"   "CR_C"   "EoT_D"  "PR_D"   "PD_D"  
[17] "CR_D"
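If the suffixes are not known in advance, they can be derived from the column names instead of hard-coding LETTERS[1:4]. The following is only a sketch: the one-row data frame is made up to mirror the names in the question, and matching on "_A" rather than "A" is my own choice so that only columns carrying that exact suffix are selected.

```r
library(dplyr)

# Hypothetical one-row data frame with the column names from the question.
cols <- c("SUBJID",
          paste(rep(c("EoT", "PR", "PD", "CR"), each = 4), LETTERS[1:4], sep = "_"))
df <- as.data.frame(matrix(seq_along(cols), nrow = 1, dimnames = list(NULL, cols)))

# Derive the suffixes (text after the last underscore) from the data itself.
suffixes <- sort(unique(sub(".*_", "", names(df)[-1])))

# Select SUBJID first, then each suffix group in turn.
reordered <- df |>
  select(SUBJID, ends_with(paste0("_", suffixes)))

colnames(reordered)
```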

CodePudding user response：

I don't know how smart it is, but you can do

df[c(1, order(sapply(strsplit(names(df), '_'), function(x) rev(x)[1])[-1]) + 1)]

for example, if your data frame looks like this:

df
#>   SUBJID EoT_A EoT_B EoT_C EoT_D PR_A PR_B PR_C PR_D PD_A PD_B PD_C PD_D CR_A CR_B CR_C CR_D
#> 1      1     2     3     4     5    6    7    8    9   10   11   12   13   14   15   16   17

Then the code puts your data into the required order:

df[c(1, order(sapply(strsplit(names(df), '_'), function(x) rev(x)[1])[-1]) + 1)]
#>   SUBJID EoT_A PR_A PD_A CR_A EoT_B PR_B PD_B CR_B EoT_C PR_C PD_C CR_C EoT_D PR_D PD_D CR_D
#> 1      1     2    6   10   14     3    7   11   15     4    8   12   16     5    9   13   17
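The one-liner is easier to follow when split into steps. This is a sketch of the same idea, not a different method; the data frame is rebuilt here from the example values, and the intermediate names (suffix, idx, reordered) are mine. As far as I can tell from the documentation of order(), ties keep their original order, which is why EoT, PR, PD, CR stay in that sequence within each letter.

```r
# Hypothetical one-row data frame matching the example above.
cols <- c("SUBJID",
          paste(rep(c("EoT", "PR", "PD", "CR"), each = 4), LETTERS[1:4], sep = "_"))
df <- as.data.frame(matrix(seq_along(cols), nrow = 1, dimnames = list(NULL, cols)))

# 1. Take the piece after the last underscore of every name except the first.
suffix <- sapply(strsplit(names(df)[-1], "_"), function(x) rev(x)[1])

# 2. order() returns positions within names(df)[-1], sorted by suffix.
idx <- order(suffix)

# 3. Shift by one to account for the dropped first column and keep column 1 in front.
reordered <- df[c(1, idx + 1)]
names(reordered)
```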

CodePudding user response：

Another option is to use sub to extract everything after the last underscore and order the columns by that suffix. Because the first column is left out of the ordering, add 1 to the result so the indices point at the right columns:

df[c(1, 1 + order(sub('.*_', '', colnames(df[,-1]))))]
#>   SUBJID EoT_A PR_A PD_A CR_A EoT_B PR_B PD_B CR_B EoT_C PR_C PD_C CR_C EoT_D
#> 1      1     1    1    1    1     1    1    1    1     1    1    1    1     1
#>   PR_D PD_D CR_D
#> 1    1    1    1

Created on 2023-01-22 with reprex v2.0.2

CodePudding user response：

Assuming x holds your column names, you can order them by their last character, which substring extracts starting at position nchar(x):

c(x[1], x[-1][order(substring(x[-1], nchar(x[-1])))])
# [1] "SUBJID" "EoT_A"  "PR_A"   "PD_A"   "CR_A"   "EoT_B" 
# [7] "PR_B"   "PD_B"   "CR_B"   "EoT_C"  "PR_C"   "PD_C"  
# [13] "CR_C"   "EoT_D"  "PR_D"   "PD_D"   "CR_D"
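The approaches above keep EoT, PR, PD, CR in sequence only because the columns already arrive in that order. If that cannot be relied on, a second sort key can make the within-group order explicit. This is a sketch under assumptions: the shuffled input and the prefix_order vector are made up for illustration.

```r
# Column names from the question, deliberately shuffled (hypothetical input).
x <- c("SUBJID", "PD_B", "EoT_A", "CR_D", "PR_A", "EoT_C", "PD_A", "CR_A",
       "PR_B", "EoT_B", "CR_B", "PD_C", "PR_C", "CR_C", "EoT_D", "PD_D", "PR_D")

# Desired order of the prefixes inside each suffix group (assumed from the question).
prefix_order <- c("EoT", "PR", "PD", "CR")

rest   <- x[-1]
suffix <- sub(".*_", "", rest)       # text after the last underscore
prefix <- sub("_[^_]*$", "", rest)   # text before the last underscore

# Sort by suffix first, then by the position of the prefix in prefix_order.
new_names <- c(x[1], rest[order(suffix, match(prefix, prefix_order))])
new_names
```

The resulting vector can then be used to reorder the data frame itself with df[new_names].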