I have two slightly different types of lists that I need to sort; however, I only need to sort portions of the list while keeping some elements in place (i.e., their index should stay the same).
First, let's say that I have a list of numbers:
x <- c(4, 8, 1, 7, 3, 0, 5, 2, 6, 9)
I know that if I only wanted to sort the first 5 elements, then I could do something like this:
x[1:5] <- sort(x[1:5])
x
# [1] 1 3 4 7 8 0 5 2 6 9
Second, if I wanted to sort a list, but keep NAs in place, then I could do something like this (though I'm sure there's a better way to do this):
y <- c(4, 8, 1, NA, NA, 7, 3, 0, 5, 2, NA, 6, NA, 9)
y[which(is.na(y)==FALSE)] <- sort(y[which(is.na(y)==FALSE)])
y
# [1] 0 1 2 NA NA 3 4 5 6 7 NA 8 NA 9
Question: How do I sort a list with alphanumeric characters by group? So, I want to first sort the list by a pre-defined letter order (i.e., c(C, A, B)), then numerically by group, but leave NAs in their original index position?
z <- c('B' , 'B1', 'B11', 'B2', 'A', 'C50', 'B21', NA, 'A5',
'B22', 'C', NA, 'C1', 'C11', NA, NA, 'C2', NA)
Expected Output
c('C', 'C1', 'C2', 'C11', 'C50', 'A', 'A5', NA, 'B', 'B1', 'B2', NA, 'B11', 'B21', NA, NA, 'B22', NA)
# [1] "C" "C1" "C2" "C11" "C50" "A" "A5" NA "B" "B1" "B2" NA "B11" "B21" NA NA "B22" NA
I know that if I just wanted to sort alphabetically, then I could just use the same code as above. However, the numeric parts are then compared as text, so they do not sort correctly numerically (e.g., B11 comes before B2).
z[which(is.na(z)==FALSE)] <- sort(z[which(is.na(z)==FALSE)])
z
# [1] "A" "A5" "B" "B1" "B11" "B2" "B21" NA "B22" "C" "C1" NA "C11" "C2" NA NA "C50" NA
However, I'm not sure how to change the order of the letters to c(C, A, B), since these are alphanumeric, and how to correctly sort numerically. I know that I could use order and match:
f <- sort(z[which(is.na(z)==FALSE)])
z[which(is.na(z)==FALSE)] <- f[order(match(f, c("C","A","B")))]
# [1] "C" "A" "B" "A5" "B1" "B11" "B2" NA "B21" "B22" "C1" NA "C11" "C2" NA NA "C50" NA
But that would only change if there is a perfect match (hence only C, A, and B move to the beginning of the list and the groups are then lost), and it would not be prudent to have to give the complete alphanumeric list to match. I'm sure there's an easy way to do this (e.g., grepl), but am unsure how to implement it.
Answer:
The function below creates a logical index of the non-NA elements ('i1'), extracts the letters from that subset of the vector and converts them to a factor with levels specified in the custom order, extracts the digits as numbers, orders the non-NA elements by the two extracted vectors, assigns them back to the non-NA positions, and returns the updated vector.
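f1 <- function(vec) {
  # logical index of the non-NA elements; NAs are never moved
  i1 <- !is.na(vec)
  # letter part (digits removed) as a factor in the custom order
  v1 <- factor(sub("\\d+", "", vec[i1]), levels = c("C", "A", "B"))
  # digit part (non-digits removed); a bare letter counts as 0
  v2 <- sub("\\D+", "", vec[i1])
  v2[!nzchar(v2)] <- 0
  v2 <- as.numeric(v2)
  # reorder only the non-NA elements: by letter group, then by number
  vec[i1] <- vec[i1][order(v1, v2)]
  vec
}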
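To see why this works, it can help to look at the two sort keys on their own. The following is only a small sketch on a made-up vector; the names demo, letters_key and digits_key are hypothetical and not part of the function above.

```r
# Illustrative input (an assumption, not the data from the question)
demo <- c("B11", "A", "C2", "B2")

# Key 1: the letter, as a factor whose level order is C < A < B
letters_key <- factor(sub("\\d+", "", demo), levels = c("C", "A", "B"))

# Key 2: the number; an element with no digits is treated as 0
digits_key <- sub("\\D+", "", demo)
digits_key[!nzchar(digits_key)] <- "0"
digits_key <- as.numeric(digits_key)

as.integer(letters_key)
# [1] 3 2 1 3
digits_key
# [1] 11 0 2 2

# order() sorts by the first key and breaks ties with the second
demo[order(letters_key, digits_key)]
# [1] "C2" "A" "B2" "B11"
```

Because the factor's integer codes follow the levels argument, order() places the C group first, then A, then B, and the numeric key sorts within each group as numbers rather than as text.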
Testing:
f1(z)
# [1] "C" "C1" "C2" "C11" "C50" "A" "A5" NA "B" "B1" "B2" NA "B11" "B21" NA NA "B22" NA
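If the letter order needs to vary, one possible variation is to pass it in as an argument instead of hard-coding it. This is a sketch under the same assumptions as above (each element is letters followed by optional digits); the name sort_keep_na and its lvls parameter are hypothetical.

```r
# Hypothetical helper: same idea as f1(), with the letter order as an argument
sort_keep_na <- function(vec, lvls) {
  keep <- !is.na(vec)                          # positions that may be reordered
  grp <- factor(sub("\\d+$", "", vec[keep]),   # letter prefix in custom order
                levels = lvls)
  num <- sub("^\\D+", "", vec[keep])           # trailing digits as text
  num[!nzchar(num)] <- "0"                     # bare letter sorts as 0
  vec[keep] <- vec[keep][order(grp, as.numeric(num))]
  vec
}

z <- c('B', 'B1', 'B11', 'B2', 'A', 'C50', 'B21', NA, 'A5',
       'B22', 'C', NA, 'C1', 'C11', NA, NA, 'C2', NA)

sort_keep_na(z, c("C", "A", "B"))
# [1] "C" "C1" "C2" "C11" "C50" "A" "A5" NA "B" "B1" "B2" NA "B11" "B21" NA NA "B22" NA
```

Note that any letter missing from lvls becomes NA in the factor, and order() places those elements after all listed groups by default, so it is worth checking that lvls covers every prefix in the data.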