How to group_by one variable and count based on another variable?

Is it possible to use group_by to group by one variable and count the values of another variable within each group? For example:

x1	x2	x3
A	1	0
B	2	1
C	3	0
B	1	1
A	1	1

I want to count the 0s and 1s of x3 within each group of x1:

x1	x3=0	x3=1
A	1	1
B	0	2
C	1	0

Is it possible to use group_by and then add something to summarize? I tried grouping by both x1 and x3, but that puts x3 in the second column, which is not what I am looking for.

If group_by alone cannot do this, I was thinking of grouping by both x1 and x3, splitting by x3, and then cbind-ing the pieces. However, the two data frames produced by the split have different numbers of rows, and there is no cbind_fill. How can I cbind them and fill in the missing cells?

CodePudding user response：

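Using the data.table package:

library(data.table)
# `dataset` is the data frame from the question
dat <- as.data.table(dataset)
# Relabel x3 so the reshaped columns are named "x3=0" and "x3=1"
dat[, x3 := paste0("x3=", x3)]
# One row per x1, one column per x3 label; length() counts the rows in each cell
result <- dcast(dat, x1 ~ x3, value.var = "x3", fun.aggregate = length)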
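
The snippet above does not define its input. Here is a minimal self-contained sketch, assuming `dataset` is simply the example table from the question (the column types are my assumption, not stated in the original answer):

```r
library(data.table)

# Assumption: `dataset` is the question's example data
dataset <- data.frame(
  x1 = c("A", "B", "C", "B", "A"),
  x2 = c(1L, 2L, 3L, 1L, 1L),
  x3 = c(0L, 1L, 0L, 1L, 1L)
)

dat <- as.data.table(dataset)          # works on a copy; `dataset` is unchanged
dat[, x3 := paste0("x3=", x3)]         # labels become the new column names
result <- dcast(dat, x1 ~ x3, value.var = "x3", fun.aggregate = length)
print(result)
```

The result should have one row per x1 and the columns `x3=0` and `x3=1`, matching the counts in the desired table. Combinations that never occur, such as B with x3 = 0, come out as 0 because length() of an empty group is 0.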

CodePudding user response：

A tidyverse approach to achieve your desired result, using dplyr::count and tidyr::pivot_wider:

library(dplyr)
library(tidyr)

df %>% 
  count(x1, x3) %>% 
  pivot_wider(names_from = "x3", values_from = "n", names_prefix = "x3=", values_fill = 0)
#> # A tibble: 3 × 3
#>   x1    `x3=0` `x3=1`
#>   <chr>  <int>  <int>
#> 1 A          1      1
#> 2 B          0      2
#> 3 C          1      0

DATA

df <- data.frame(
                x1 = c("A", "B", "C", "B", "A"),
                x2 = c(1L, 2L, 3L, 1L, 1L),
                x3 = c(0L, 1L, 0L, 1L, 1L)
)

CodePudding user response：

Yes, it is possible. Here is an example:

dat = read.table(text = "x1     x2  x3
 A  1   0
 B  2   1
 C  3   0
 B  1   1
 A  1   1", header = TRUE)

library(dplyr)
library(tidyr)

dat %>% group_by(x1) %>% 
        count(x3) %>%              # rows per (x1, x3) pair
        pivot_wider(names_from = x3,
                    names_glue = "x3 = {x3}", 
                    values_from = n) %>% 
        replace(is.na(.), 0)       # combinations that never occur become 0

# A tibble: 3 x 3
# Groups:   x1 [3]
# x1    `x3 = 0` `x3 = 1`
#  <chr>     <int>     <int>
#1 A             1         1
#2 B             0         2
#3 C             1         0
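
If the values of x3 are known in advance (here, only 0 and 1), the question's original idea of using group_by followed by summarise also works without any reshaping. This is a minimal sketch under that assumption; the name `res` and the column labels are mine:

```r
library(dplyr)

# Example data from the question
dat <- data.frame(
  x1 = c("A", "B", "C", "B", "A"),
  x2 = c(1L, 2L, 3L, 1L, 1L),
  x3 = c(0L, 1L, 0L, 1L, 1L)
)

# One summary column per known x3 value;
# sum() over a logical vector counts the TRUEs
res <- dat %>%
  group_by(x1) %>%
  summarise(`x3=0` = sum(x3 == 0), `x3=1` = sum(x3 == 1))

print(res)
```

This does not scale to many distinct values, since each needs its own summarise expression. In that case the count plus pivot_wider approach is the better fit.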