I have a data frame that I'd like to order based on a vector of IDs and on all the columns of another data frame.
id.namestest = data.frame(test = NA, id= c("id1", "id2", "id3","id3", "id2", "id1"))
head(admix)
# V1 V2 V3
# [1,] 0.1019623 0.8961855 1.852222e-03
# [2,] 0.6891593 0.3107807 5.999776e-05
# [3,] 0.7274040 0.2697308 2.865165e-03
# [4,] 0.3458368 0.6514100 2.753215e-03
# [5,] 0.3946996 0.6053004 1.000000e-09
# [6,] 0.6383386 0.3585409 3.120463e-03
admix=structure(c(0.101962262250848, 0.68915927427333, 0.727404046114676,
0.345836796905855, 0.394699646563406, 0.638338623952938, 0.896185515801946,
0.310780727965854, 0.26973078933548, 0.65140998802539, 0.605300352436594,
0.358540912890725, 0.00185222194720621, 5.99977608165462e-05,
0.00286516454984352, 0.00275321506875506, 1e-09, 0.00312046315633649
), dim = c(6L, 3L), dimnames = list(NULL, c("V1", "V2", "V3")))
The code below works, but I have to list the columns of admix manually:
admix.tmp = cbind(admix, id.namestest)
if (K==3) { admix.sort.tmp = admix.tmp[order(id.namestest[,2], admix[,1],admix[,2],admix[,3]),]}
Instead, I'd like to supply the column order as a vector, sort.order:
sort.order = c(1,2,3)
admix.sort.tmp = admix.tmp[order(id.namestest[,2], admix[,sort.order]),]
But I get this:
Error in order(id.namestest[, 2], admix[, c(1, 2, 3)]) :
argument lengths differ
I also tried:
admix.sort.tmp = admix.tmp[order(id.namestest[,2], asplit(admix, 2)),]
but I get the same error.
CodePudding user response:
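As the error shows, id.namestest[,2] is a vector of length 6, whereas admix[, c(1, 2, 3)] is a matrix, so its length is the total number of elements in the matrix (18). order needs all of its sort keys to have the same length. Passing asplit(admix, 2) directly fails for the same reason: order receives the whole list as a single argument of length 3. We can instead build one list of equal-length vectors and pass its elements to order with do.call: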
admix.tmp[do.call(order, c(list(id.namestest[,2]), asplit(admix, 2))),]
-output
V1 V2 V3 test id
1 0.1019623 0.8961855 1.852222e-03 NA id1
6 0.6383386 0.3585409 3.120463e-03 NA id1
5 0.3946996 0.6053004 1.000000e-09 NA id2
2 0.6891593 0.3107807 5.999776e-05 NA id2
4 0.3458368 0.6514100 2.753215e-03 NA id3
3 0.7274040 0.2697308 2.865165e-03 NA id3
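To make the length mismatch visible and to actually use sort.order, here is a minimal sketch. It assumes a rounded copy of the question's data (the shortened values are for illustration only); the intermediate names keys and idx are not from the original post.

```r
# Rounded copy of the question's data (values shortened for illustration)
admix <- matrix(
  c(0.102, 0.689, 0.727, 0.346, 0.395, 0.638,
    0.896, 0.311, 0.270, 0.651, 0.605, 0.359,
    0.002, 0.000, 0.003, 0.003, 0.000, 0.003),
  nrow = 6, dimnames = list(NULL, c("V1", "V2", "V3"))
)
id.namestest <- data.frame(test = NA,
                           id = c("id1", "id2", "id3", "id3", "id2", "id1"))
admix.tmp <- cbind(admix, id.namestest)

sort.order <- c(1, 2, 3)

# The mismatch behind "argument lengths differ":
n_ids  <- length(id.namestest[, 2])    # one element per row
n_vals <- length(admix[, sort.order])  # rows times selected columns

# Split the selected columns into a list of vectors, put the id vector
# first, and hand every list element to order() as a separate argument.
keys <- c(list(id.namestest[, 2]),
          asplit(admix[, sort.order, drop = FALSE], 2))
idx <- do.call(order, keys)
admix.sort.tmp <- admix.tmp[idx, ]
```

Changing sort.order (for example to c(2, 1)) changes which columns break ties within each id and in what priority; drop = FALSE keeps the subset a matrix even when a single column is selected.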
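Because the keys are collected in a list of vectors or a data.frame rather than in a single matrix, each column keeps its own type. A data.frame is itself a list, so it can be passed to do.call directly: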
admix.tmp[do.call(order, cbind(id.namestest[2], admix)),]
V1 V2 V3 test id
1 0.1019623 0.8961855 1.852222e-03 NA id1
6 0.6383386 0.3585409 3.120463e-03 NA id1
5 0.3946996 0.6053004 1.000000e-09 NA id2
2 0.6891593 0.3107807 5.999776e-05 NA id2
4 0.3458368 0.6514100 2.753215e-03 NA id3
3 0.7274040 0.2697308 2.865165e-03 NA id3
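If this is needed in several places, the same idea could be wrapped in a small helper. The function name order_by_id_and_cols is hypothetical, and the data are again a rounded copy of the question's values. The unname() call is a precaution I am adding: do.call passes list names as argument names, so a column that happened to be called decreasing or method would otherwise be read as an option of order rather than as a sort key.

```r
# Hypothetical helper: row order by ids first, then by the chosen columns
# of mat, in the priority given by cols.
order_by_id_and_cols <- function(ids, mat, cols = seq_len(ncol(mat))) {
  # as.data.frame() turns the selected columns into a list of vectors
  keys <- c(list(ids), as.data.frame(mat[, cols, drop = FALSE]))
  # Drop the names so no column is mistaken for an argument of order()
  do.call(order, unname(keys))
}

# Rounded copy of the question's data (values shortened for illustration)
ids <- c("id1", "id2", "id3", "id3", "id2", "id1")
admix <- matrix(
  c(0.102, 0.689, 0.727, 0.346, 0.395, 0.638,
    0.896, 0.311, 0.270, 0.651, 0.605, 0.359,
    0.002, 0.000, 0.003, 0.003, 0.000, 0.003),
  nrow = 6, dimnames = list(NULL, c("V1", "V2", "V3"))
)

idx_default <- order_by_id_and_cols(ids, admix)            # id, V1, V2, V3
idx_v2_v1   <- order_by_id_and_cols(ids, admix, c(2, 1))   # id, V2, V1
```

The returned index can then be used as admix.tmp[idx_default, ] in the same way as the do.call lines above.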
Or using dplyr
library(dplyr)
admix.tmp %>%
  arrange(id, across(all_of(colnames(admix[, sort.order, drop = FALSE]))))