tab1 <- data.frame(id = c(1, 42, 2, 88, 432, 9584), name = c("apple", "banana",
"apple", "mango",
"mango", "apple"))
> tab1
    id   name
1    1  apple
2   42 banana
3    2  apple
4   88  mango
5  432  mango
6 9584  apple
I have a data.frame named tab1 that serves as a dictionary for looking up the ids associated with different patterns.
For example, suppose I want to find the ids that are associated with the pattern "apple":
> tab1[which(tab1$name %in% "apple"), ]$id
[1] 1 2 9584
or with the pattern "mango":
> tab1[which(tab1$name %in% "mango"), ]$id
[1] 88 432
I would like to store these ids in a new data.frame where the ids are separated by | like this:
  pattern       id
1   apple 1|2|9584
2   mango   88|432
3   peach       NA
Suppose I have a very long list of patterns (say, over 1 million) that I want to match against the names in tab1. What's a quick way of doing this in R without relying on for loops?
CodePudding user response:
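Try this: aggregate the ids of the matching names into one string per name with paste0(collapse = "|"), then rbind the patterns that have no match in tab1 with an NA id.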
tab1 <- data.frame(id = c(1, 42, 2, 88, 432, 9584),
                   name = c("apple", "banana", "apple", "mango", "mango", "apple"))
pattern <- c("apple", "mango", "peach", "orange")
# Collapse the ids of each matched name into one "|"-separated string,
# then append the patterns that are absent from tab1 with id = NA.
rbind(aggregate(. ~ name, data = tab1[tab1$name %in% pattern, ],
                FUN = function(x) paste0(x, collapse = "|")),
      data.frame(name = pattern[!pattern %in% tab1$name], id = NA))
    name       id
1  apple 1|2|9584
2  mango   88|432
3  peach     <NA>
4 orange     <NA>
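
If you need the rows in the same order as your pattern vector, or the pattern list is very long, a possible variation is to collapse the ids once per distinct name and then look every pattern up with match(). This is only a sketch using base R, assuming the same tab1 and pattern as above; the names collapsed and result are just illustrative, and I have not timed it on a million patterns.

```r
tab1 <- data.frame(id = c(1, 42, 2, 88, 432, 9584),
                   name = c("apple", "banana", "apple", "mango", "mango", "apple"))
pattern <- c("apple", "mango", "peach", "orange")

# Collapse the ids once per distinct name into a named character vector,
# e.g. collapsed[["apple"]] is "1|2|9584".
collapsed <- vapply(split(tab1$id, tab1$name), paste, character(1), collapse = "|")

# Vectorised lookup: match() returns NA for patterns that are absent from tab1,
# and indexing with NA yields an NA id for those rows.
result <- data.frame(pattern = pattern,
                     id = unname(collapsed[match(pattern, names(collapsed))]))
result
```

Here result has one row per element of pattern, in the original order, with NA for "peach" and "orange". The paste step runs once per distinct name in tab1 rather than once per pattern, so the cost of the long pattern list is limited to a single match() call.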