I am a novice in R and have a data frame with two fields. I need to count the number of times each element of the first field appears in the second field. The second field can contain more than one element, which is why the code below doesn't give the right answer. How can I modify it, or what function can I use here? The count for A1 should be 3, but it comes out as 1 because the occurrences of A1 in A1;A2 and A3;A1 are not recognized by this code. Thanks.
library(dplyr)  # provides filter()
df0 <- data.frame(ID = c("A1", "A2", "A3", "A4", "B1", "C1", "D1"),
                  Refer = c(" ", " ", "A1", "A1;A2", "A3;A1", "A2", "A2;C1"))
n1 <- nrow(df0)
df1 <- data.frame(matrix(
  vector(), 0, 2, dimnames = list(c(), c("ID", "Count"))),
  stringsAsFactors = FALSE)
for (i in 1:n1) {
  id <- df0$ID[i]
  df2 <- filter(df0, Refer == id) # This assumes only a single ID can be there in Refer
  n2 <- nrow(df2)
  df1[i, 1] <- id
  df1[i, 2] <- n2
}
CodePudding user response:
You are almost there. However, you should use grepl() instead of the exact filter Refer == id.
library(dplyr)
df0 <- data.frame(ID = c("A1", "A2", "A3", "A4", "B1", "C1", "D1"),
                  Refer = c(" ", " ", "A1", "A1;A2", "A3;A1", "A2", "A2;C1"))
result <- lapply(df0$ID, function(x) {
  # Count the rows whose Refer field contains the ID x
  n <- df0 %>% filter(grepl(x, Refer)) %>% nrow
  data.frame(ID = x, count = n)
}) %>%
  bind_rows
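One caveat with grepl(): it matches substrings, so an ID such as A1 would also be counted for a reference to A10. That does not happen with the data in the question, but if it matters, a minimal base R sketch of exact-token matching could look like this. It assumes references are always separated by ";", and the helper name count_refs is hypothetical, not part of the answer above.

```r
df0 <- data.frame(
  ID    = c("A1", "A2", "A3", "A4", "B1", "C1", "D1"),
  Refer = c(" ", " ", "A1", "A1;A2", "A3;A1", "A2", "A2;C1"),
  stringsAsFactors = FALSE
)

# count_refs is a hypothetical helper: it counts the rows whose Refer
# field contains `id` as a complete ";"-separated token.
count_refs <- function(id, refer) {
  tokens <- strsplit(refer, ";", fixed = TRUE)
  sum(vapply(tokens, function(tok) id %in% trimws(tok), logical(1)))
}

result <- data.frame(
  ID    = df0$ID,
  count = vapply(df0$ID, count_refs, integer(1),
                 refer = df0$Refer, USE.NAMES = FALSE)
)
```

Because each Refer value is split into whole tokens before comparison, A1 would no longer match A10 or A11.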
CodePudding user response:
You might strsplit "Refer" at ; and unlist it. Next, create a factor out of it using "ID" as levels and simply table the result.
table(factor(unlist(strsplit(df0$Refer, ';')), levels=df0$ID))
# A1 A2 A3 A4 B1 C1 D1
# 3 3 1 0 0 1 0
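To see why this works, the one-liner can be unpacked into its intermediate steps. This is only a sketch; the variable names parts, refs, refs_f and counts are hypothetical.

```r
df0 <- data.frame(
  ID    = c("A1", "A2", "A3", "A4", "B1", "C1", "D1"),
  Refer = c(" ", " ", "A1", "A1;A2", "A3;A1", "A2", "A2;C1"),
  stringsAsFactors = FALSE
)

# Split each Refer value at ";" into a list of character vectors.
parts <- strsplit(df0$Refer, ";", fixed = TRUE)

# Flatten the list into one character vector of individual references.
refs <- unlist(parts)

# Using the IDs as factor levels turns non-ID values (such as " ") into NA
# and keeps IDs that are never referenced as levels.
refs_f <- factor(refs, levels = df0$ID)

# table() ignores NA by default and reports unused levels with a count of 0.
counts <- table(refs_f)
```

Setting the levels explicitly is what makes A4, B1 and D1 show up with a count of 0 instead of being dropped.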
CodePudding user response:
Here is a tidyverse solution:
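library(dplyr)
library(tidyr)
library(stringr)
# Assumption: `pattern` was not defined in the post; here it matches any ID
pattern <- paste(df0$ID, collapse = "|")
df0 %>%
  separate_rows(Refer) %>%
  mutate(x = str_detect(Refer, pattern)) %>%
  filter(x == TRUE) %>%
  count(Refer)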
  Refer     n
  <chr> <int>
1 A1        3
2 A2        3
3 A3        1
4 C1        1