my input:
df<-data.frame("frame"=c(1,2,3,4,5,6,7,8,9,10),
"label_x"=c("AO","Other","AO","GS","GS","RF","RF","TI",NA,"Other"),
"label_y"=c("AO","RF","RF", "GS","GS","Other","Other","TI","AO","RF"),
"cross"=c("Matched","Mismatched", "Mismatched","Matcehed","Matched"
,"Mismatched", "Mismatched","Mismatched","Mismatched","Mismatched") )
I want to count all "Matches/Mismatches" from column cross
per label, for column label_x
and label_y
(both). So I tried this code for each column label_
:
df %>% filter(!is.na(label_y )) %>% group_by(label_y) %>% count(cross)
but it doesn't answer my question, after that I need to sum the counts for each column . So I expect something like this...:
label Mismatching Matching Total
AO 5 7 13
RF 3 4 7
CodePudding user response:
On way to do it:
df %>% pivot_longer(cols = c(label_x ,label_y), values_to = "label") %>%
group_by(label) %>% count(cross) %>%
pivot_wider(values_from = n, names_from = cross, values_fill = 0) %>%
mutate(total = Matched Mismatched)
Result tibble:
# A tibble: 6 x 4
# Groups: label [6]
label Matched Mismatched total
<chr> <int> <int> <int>
1 AO 2 2 4
2 GS 4 0 4
3 Other 0 4 4
4 RF 0 5 5
5 TI 0 2 2
6 NA 0 1 1
However, keep in mind that the matched number is overestimated because both label_x and label_y have been used. Could you show a result table with the real labels and number you expect ?
CodePudding user response:
Using table:
table(data.frame(label = unlist(df[, c("label_x", "label_y")]),
cross = df$cross))
# cross
#label Matcehed Matched Mismatched
# AO 0 2 2
# GS 2 2 0
# Other 0 0 4
# RF 0 0 5
# TI 0 0 2