R order list based on multiple characters from each item

Time:07-12

I'd like to sort a list based on more than the first character of each item. The list holds character (chr) data, though some of those characters are digits. I've been trying to use a combination of substr() and order(), but to no avail.

For example:

mylist <- c('0_times','3-10_times','11_20_times','1-2_times','more_than_20_times')
mylist[order(substr(mylist,1,2))]

However, this results in 11_20_times being placed before 3-10_times:

[1] "0_times"            "1-2_times"          "11_20_times"        "3-10_times"         "more_than_20_times"

Update
To provide further detail on the use case, my data is similar to the following:

mydf <- data.frame(X1=c("0_times","3-10_times", "11-20_times", "1-2_times","3-10_times",
                        "0_times","3-10_times", "11-20_times", "1-2_times","3-10_times" ),
                   X2=c('a','b','c','d','e','a','b','c','d','e'))

mydf2 <- data.frame(names = colnames(mydf))

mydf2$vals <- lapply(mydf, unique)

It is the vectors in mydf2$vals that I would like to sort. While the solution from @AllanCameron functions perfectly on a single vector, I'd like to apply that to each vector contained within mydf2$vals but cannot figure out how.

I have tried using unlist() to access the vectors inside, but again I can only do this one row at a time:

unlist(mydf2[1,'vals'], use.names=FALSE)

My inexperience is evident here, but I've been struggling with this all day.

CodePudding user response:

This requires a bit of string parsing and converting to numeric:

# Split each string on runs of non-digits, then take the smallest number found
o <- sapply(strsplit(mylist, '\\D+'), function(x) min(as.numeric(x[nzchar(x)])))
mylist[order(o)]
#> [1] "0_times"            "1-2_times"          "3-10_times"        
#> [4] "11_20_times"        "more_than_20_times"
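
For the updated question, one possible approach is to wrap the same idea in a function and lapply() it over the list column. This is only a sketch: the helper name sort_by_first_number is made up for illustration, and the two choices it makes (strings without any digits get a key of Inf so they sort last, and ties are broken alphabetically) are assumptions rather than anything stated in the question.

```r
# Hypothetical helper (name invented for this example): sort a character
# vector by the smallest number embedded in each element.
sort_by_first_number <- function(x) {
  x <- as.character(x)  # also copes with factor columns from older R versions
  # Split on runs of non-digits, e.g. "3-10_times" -> c("3", "10")
  parts <- strsplit(x, "\\D+")
  # One numeric key per element; Inf when the element contains no digits
  # (assumption: such elements should sort last)
  key <- vapply(parts, function(p) {
    nums <- as.numeric(p[nzchar(p)])
    if (length(nums) == 0) Inf else min(nums)
  }, numeric(1))
  # Order by the numeric key, breaking ties alphabetically
  x[order(key, x)]
}

# Data from the question
mydf <- data.frame(
  X1 = c("0_times", "3-10_times", "11-20_times", "1-2_times", "3-10_times",
         "0_times", "3-10_times", "11-20_times", "1-2_times", "3-10_times"),
  X2 = c("a", "b", "c", "d", "e", "a", "b", "c", "d", "e")
)
mydf2 <- data.frame(names = colnames(mydf))
mydf2$vals <- lapply(mydf, unique)

# Apply the helper to every vector stored in the list column
mydf2$vals <- lapply(mydf2$vals, sort_by_first_number)

mydf2$vals[[1]]
#> [1] "0_times"     "1-2_times"   "3-10_times"  "11-20_times"
```

Because mydf2$vals is a list, lapply() visits each vector in turn, so there is no need to unlist() row by row. vapply() is used instead of sapply() here only so that the key is guaranteed to be a numeric vector.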