I would like to find the position of every distinct value in a vector, taking the last occurrence whenever a value appears more than once. For example, I would like to identify the positions
c(2,3,4,6,7)
in the vector:
v <- c("m", "m", "k", "r", "l", "o", "l")
I see that
(duplicated(v) | duplicated(v, fromLast = TRUE))
flags every occurrence of a duplicated value, but I only want the last occurrence of each duplicated element, along with the values that occur once.
How can I achieve this without a loop?
CodePudding user response:
Do you need:
duplicated(v)
[1] FALSE TRUE FALSE FALSE FALSE FALSE TRUE
# and for the indices
which(duplicated(v))
[1] 2 7
or as akrun suggests:
which(!duplicated(v, fromLast = TRUE))
[1] 2 3 4 6 7
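To spell out why the `fromLast = TRUE` version gives the requested positions: `duplicated()` then scans from the end of the vector, so an element is only flagged if the same value appears again later. Negating that leaves the last occurrence of each value. A minimal sketch (the names `is_last` and `last_idx` are just illustrative):

```r
v <- c("m", "m", "k", "r", "l", "o", "l")

# TRUE where no identical value occurs later in the vector,
# i.e. at the last occurrence of each distinct value
is_last <- !duplicated(v, fromLast = TRUE)

# positions of those last occurrences, in their original order
last_idx <- which(is_last)
last_idx

# the values found at those positions
v[last_idx]
```

Values that occur only once are their own last occurrence, so they are kept as well.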
CodePudding user response:
You could do something like:
library(dplyr)
v %>%
as_tibble() %>%
mutate(index = row_number()) %>%
group_by(value) %>%
mutate(id=row_number()) %>%
filter(id == max(id))
Which gives us:
# A tibble: 5 × 3
# Groups:   value [5]
  value index    id
  <chr> <int> <int>
1 m         2     2
2 k         3     1
3 r         4     1
4 o         6     1
5 l         7     2
Additionally, if you just want the index, you can do:
v %>%
as_tibble() %>%
mutate(index = row_number()) %>%
group_by(value) %>%
mutate(id=row_number()) %>%
filter(id == max(id)) %>%
pull(index)
...to get:
[1] 2 3 4 6 7
CodePudding user response:
We can try
> sort(tapply(seq_along(v), v, max))
m k r o l
2 3 4 6 7
or
> unique(ave(seq_along(v), v, FUN = max))
[1] 2 3 4 7 6
or
> rev(length(v) - which(!duplicated(rev(v))) + 1)
[1] 2 3 4 6 7
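Note that the `ave()` variant returns the positions in order of each value's first appearance (hence `7 6` at the end), so it needs a `sort()` if you want ascending positions. As a quick sanity check, here is a small sketch comparing the base R approaches from this thread on the example vector; the names `results`, `expected` and `agree` are only for illustration:

```r
v <- c("m", "m", "k", "r", "l", "o", "l")

# each approach, reduced to a plain vector of positions in ascending order
results <- list(
  duplicated = which(!duplicated(v, fromLast = TRUE)),
  tapply     = as.vector(sort(tapply(seq_along(v), v, max))),
  ave        = sort(unique(ave(seq_along(v), v, FUN = max))),
  rev        = rev(length(v) - which(!duplicated(rev(v))) + 1)
)

# positions requested in the question
expected <- c(2, 3, 4, 6, 7)

# TRUE for every approach whose result matches the expected positions
agree <- vapply(results, function(x) all(x == expected), logical(1))
agree
```

The comparison uses `==` rather than `identical()` because the `rev()` variant produces doubles (from the `+ 1`) while the others produce integers.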