I have a following data frame:
df <- data.frame(Org_name = c("A", "B", "C", "D", "A", "D", "C", "B"),
Symbol = c("F", "P", "X", "F", "F", "O", "O", "P"))
It contains four values of "Org_name" (A, B, C, D) variable associated with one of the four values of "Symbol" variable (F, P, X, O).
I want to count how many of either F, P, X or O from column "Symbol" are associated with A, B, C, or D from column "Orga_name" in the said data frame.
The result should look like this:
df1 <- data.frame(Org_name = c("A", "B", "C", "D"),
F_freq = c("2", "0", "0", "1"),
P_freq = c("0", "2", "0", "0"),
O_freq = c("0", "0", "1", "1"),
X_freq = c("0", "0", "1", "0"))
I am trying to wrap my head around this, but after few hours of attempts I am nowhere near the end result.
Any advice is welcomed.
CodePudding user response:
Does this work:
library(dplyr)
library(tidyr)
df %>% count(Org_name, Symbol) %>%
pivot_wider(Org_name, names_from = Symbol, values_from = n, values_fill = 0)
# A tibble: 4 × 5
Org_name F P O X
<chr> <int> <int> <int> <int>
1 A 2 0 0 0
2 B 0 2 0 0
3 C 0 0 1 1
4 D 1 0 1 0
CodePudding user response:
In base R:
tab <- as.data.frame.matrix(table(df$Org_name, df$Symbol))
cbind(Org_name = rownames(tab), data.frame(tab, row.names=NULL))
Another option is janitor::tabyl
:
df %>%
janitor::tabyl(Org_name, Symbol)
output
Org_name F O P X
A 2 0 0 0
B 0 0 2 0
C 0 1 0 1
D 1 1 0 0