I need help sorting a data frame that has more than 100 observations and thousands of variables. I want the columns sorted in numeric order by name (e.g. 1, 2, 3, 4).
This is a sample data frame:
print(df)
P1 P5 P3 P2 P15
1 1 11 6 21 35
2 2 12 7 22 34
3 3 13 8 23 33
4 4 14 9 24 32
5 5 15 10 25 31
6 6 16 11 26 30
7 7 17 12 27 29
8 8 18 13 28 28
9 9 19 14 29 27
10 10 20 15 30 26
But I want the columns ordered by the number in their names, like this:
P1 P2 P3 P5 P15
1 1 21 6 11 35
2 2 22 7 12 34
3 3 23 8 13 33
4 4 24 9 14 32
5 5 25 10 15 31
6 6 26 11 16 30
7 7 27 12 17 29
8 8 28 13 18 28
9 9 29 14 19 27
10 10 30 15 20 26
Also, I am not using any packages, so I would prefer a base R solution.
CodePudding user response:
You may use mixedorder/mixedsort from gtools.
df <- data.frame(P10 = rnorm(5), P1 = rnorm(5), P15 = rnorm(5))
df[gtools::mixedsort(names(df))]
# P1 P10 P15
#1 0.2306336 -0.3343381 0.9183412
#2 -1.6918624 -0.1055599 -0.4527006
#3 0.6597919 -0.7305097 -1.7483723
#4 -1.0236236 1.9050436 1.7699041
#5 -0.8915216 0.3326217 -2.3774069
CodePudding user response:
Since the column names start with a letter, you can't just sort by name; you have to use a string sort that treats the digits as numbers.
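To make the problem concrete, here is a small sketch using the column names from the question. The variable names cols and sorted_cols are just for illustration. A plain sort compares the names character by character, so "P15" ends up before "P2":

```r
# Column names from the question's sample data frame
cols <- c("P1", "P5", "P3", "P2", "P15")

# Plain character sort; method = "radix" makes the result locale-independent
sorted_cols <- sort(cols, method = "radix")
sorted_cols
# [1] "P1"  "P15" "P2"  "P3"  "P5"
```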
library(stringr)
df <- data.frame(matrix(runif(16), ncol = 8, nrow = 2))
names(df) <- c("P100", "R3", "P1", "P2", "P10", "P11", "P100C", "PP9")
df
# P100 R3 P1 P2 P10 P11 P100C PP9
# 1 0.1774975 0.5982241 0.6745210 0.31186217 0.5893895 0.2070617 0.4891480 0.5275640
# 2 0.8936577 0.7666618 0.1734269 0.09079242 0.9984647 0.7152058 0.9342534 0.7952669
df[, str_sort(names(df), numeric = TRUE)]
# P1 P2 P10 P11 P100 P100C PP9 R3
# 1 0.6745210 0.31186217 0.5893895 0.2070617 0.1774975 0.4891480 0.5275640 0.5982241
# 2 0.1734269 0.09079242 0.9984647 0.7152058 0.8936577 0.9342534 0.7952669 0.7666618