I have two long lists; one of them is a consecutive subset of the other. Example:
full= c("cat", "dog", "giraffe", "gorilla", "opossum", "rat")
subset= c("giraffe", "gorilla", "opossum")
Is there an elegant way to get the index where the match starts, ends, or both? In the example above, I would like to get 3, since that is the index of "giraffe" in full.
To clarify, if subset= c("giraffe", "rat", "gorilla", "opossum")
the output should be NA.
CodePudding user response:
zoo::rollapply(full, 3, FUN = identical, subset)
# [1] FALSE FALSE TRUE FALSE
which(zoo::rollapply(full, 3, FUN = identical, subset))[1]
# [1] 3
zoo::rollapply(full, 3, FUN = identical, c("giraffe", "rat", "gorilla", "opossum"))
# [1] FALSE FALSE FALSE FALSE
which(zoo::rollapply(full, 3, FUN = identical, c("giraffe", "rat", "gorilla", "opossum")))[1]
# [1] NA
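The calls above hard-code a window width of 3. As a rough sketch of the same sliding-window idea in base R, with the width taken from the length of the shorter vector and both the start and end index returned, something like the following could work. The helper name find_run and the vector animals are made up for illustration and are not part of the answer above.

```r
# Hypothetical helper (not from the original answer): returns the start and
# end index of the first place where subvec occurs as a consecutive run in
# fullvec, or NA for both when there is no such run.
find_run <- function(subvec, fullvec) {
  n <- length(subvec)
  m <- length(fullvec)
  no_match <- c(start = NA_integer_, end = NA_integer_)
  if (n == 0 || n > m) return(no_match)
  # Only positions where the first element matches (and a full window fits)
  # can be the start of a run.
  starts <- which(fullvec[seq_len(m - n + 1)] == subvec[1])
  for (s in starts) {
    if (identical(fullvec[s:(s + n - 1)], subvec)) {
      return(c(start = s, end = s + n - 1L))
    }
  }
  no_match
}

animals <- c("cat", "dog", "giraffe", "gorilla", "opossum", "rat")
find_run(c("giraffe", "gorilla", "opossum"), animals)
# start   end
#     3     5
find_run(c("giraffe", "rat", "gorilla", "opossum"), animals)
# start   end
#    NA    NA
```

Checking only the candidate start positions keeps the number of full window comparisons small when the first element is rare in the long vector.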
CodePudding user response:
We may need match with a condition:
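f1 <- function(subvec, fullvec) {
  # position of each element of subvec in fullvec (0 when it is absent)
  i1 <- match(subvec, fullvec, nomatch = 0)
  # NA if any element is missing or the positions are not consecutive
  if (any(i1 == 0) || any(diff(i1) != 1)) NA else i1[1]
}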
Testing:
> f1(subset, full)
[1] 3
> f1(subset2, full)
[1] NA
> f1(subset[c(1, 3)], full)
[1] NA
Data:
full <- c("cat", "dog", "giraffe", "gorilla", "opossum", "rat")
subset <- c("giraffe", "gorilla", "opossum")
subset2 <- c("giraffe", "rat", "gorilla", "opossum")
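One caveat with a match()-based check: match() returns only the first position of each value, so if the long vector contains repeated values, a run that starts at a later duplicate can be reported as NA. The snippet below is a small illustration under that assumption. The names f1_match, full_dup and sub are made up for this example; f1_match mirrors the match()-based function, and the window comparison after it is one possible alternative, not the answer's method.

```r
# Mirrors the match()-based approach (hypothetical name for this example).
f1_match <- function(subvec, fullvec) {
  i1 <- match(subvec, fullvec, nomatch = 0)
  if (any(diff(i1) != 1)) NA else i1[1]
}

# "giraffe" appears twice; the consecutive run starts at the second one.
full_dup <- c("cat", "giraffe", "dog", "giraffe", "gorilla", "opossum")
sub <- c("giraffe", "gorilla", "opossum")

# match() yields positions 2, 5, 6, which are not consecutive.
f1_match(sub, full_dup)
# [1] NA

# Comparing every window of length(sub) against sub finds the run.
n <- length(sub)
starts <- seq_len(length(full_dup) - n + 1)
hits <- vapply(starts,
               function(s) identical(full_dup[s:(s + n - 1)], sub),
               logical(1))
which(hits)[1]
# [1] 4
```

If the long vector is known to have no repeated values, the match()-based version avoids the extra window comparisons.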