Home > front end >  Obtaining frequencies of values from one column associated with another value from the same row in a
Obtaining frequencies of values from one column associated with another value from the same row in a

Time:09-23

I have a following data frame:

df <- data.frame(Org_name = c("A", "B", "C", "D", "A", "D", "C", "B"),
                 Symbol = c("F", "P", "X", "F", "F", "O", "O", "P"))

It contains four values of "Org_name" (A, B, C, D) variable associated with one of the four values of "Symbol" variable (F, P, X, O).

I want to count how many of either F, P, X or O from column "Symbol" are associated with A, B, C, or D from column "Orga_name" in the said data frame.

The result should look like this:

df1 <- data.frame(Org_name = c("A", "B", "C", "D"),
                  F_freq = c("2", "0", "0", "1"),
                  P_freq = c("0", "2", "0", "0"),
                  O_freq = c("0", "0", "1", "1"),
                  X_freq = c("0", "0", "1", "0"))

I am trying to wrap my head around this, but after few hours of attempts I am nowhere near the end result.

Any advice is welcomed.

CodePudding user response:

Does this work:

library(dplyr)
library(tidyr)

df %>% count(Org_name, Symbol) %>% 
     pivot_wider(Org_name, names_from = Symbol, values_from = n, values_fill = 0)
# A tibble: 4 × 5
  Org_name     F     P     O     X
  <chr>    <int> <int> <int> <int>
1 A            2     0     0     0
2 B            0     2     0     0
3 C            0     0     1     1
4 D            1     0     1     0

CodePudding user response:

In base R:

tab <- as.data.frame.matrix(table(df$Org_name, df$Symbol))
cbind(Org_name = rownames(tab), data.frame(tab, row.names=NULL))

Another option is janitor::tabyl:

df %>% 
  janitor::tabyl(Org_name, Symbol)

output

 Org_name F O P X
        A 2 0 0 0
        B 0 0 2 0
        C 0 1 0 1
        D 1 1 0 0
  •  Tags:  
  • r
  • Related