I am trying to save grouped counts of each factor level into new variables.
Let's say my data look like this:
a <- c(1,1,1,1,1,2,2,2,2,2,3,3,3,3,3,4,4,4,4,4)
b <- c("acc", "rej", "con", "acc", "rej", "con", "acc", "rej", "con", "acc", "rej", "con", "acc", "rej", "con","acc", "rej", "con", "acc", "rej")
df <- data.frame(a,b)
The resulting data frame should look like this:
a <- c(1,2,3,4)
number_acc <- c(2,2,1,2)
number_rej <- c(2,1,2,2)
number_con <- c(1,2,2,1)
I tried to solve the problem in the following way:
df2 <- df %>%
  group_by(a) %>%
  mutate(number_acc = count(b == 'acc'),
         number_rej = count(b == 'rej'),
         number_con = count(b == 'con'))
However, I get an error message saying that the method "count" cannot be applied to objects of class "logical".
Thank you for your help!
CodePudding user response:
Use the tabyl function from the janitor package:
Your data:
a <- c(1,1,1,1,1,2,2,2,2,2,3,3,3,3,3,4,4,4,4,4)
b <- c("acc", "rej", "con", "acc", "rej", "con", "acc", "rej", "con", "acc", "rej", "con", "acc", "rej", "con","acc", "rej", "con", "acc", "rej")
df <- data.frame(a,b)
Cross-tabulate a by b to get the counts per group:
library(janitor)
data_summary <- tabyl(df, a, b)
data_summary
#  a acc con rej
#  1   2   1   2
#  2   2   2   1
#  3   1   2   2
#  4   2   1   2
CodePudding user response:
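To make the existing code work, we need summarise() instead of mutate(), and sum() instead of count(). count() expects a data frame, whereas sum() on a logical vector returns the number of TRUE values: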
library(dplyr)
df %>%
  group_by(a) %>%
  summarise(number_acc = sum(b == 'acc'),
            number_rej = sum(b == 'rej'),
            number_con = sum(b == 'con'))
# # A tibble: 4 x 4
#       a number_acc number_rej number_con
#   <dbl>      <int>      <int>      <int>
# 1     1          2          2          1
# 2     2          2          1          2
# 3     3          1          2          2
# 4     4          2          2          1
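To see why sum() is the right tool here, it may help to look at a single group by hand. This is only a minimal base R sketch using the data from the question; the name is_acc is made up for illustration.

```r
# Data from the question
a <- c(1,1,1,1,1,2,2,2,2,2,3,3,3,3,3,4,4,4,4,4)
b <- c("acc", "rej", "con", "acc", "rej", "con", "acc", "rej", "con", "acc",
       "rej", "con", "acc", "rej", "con", "acc", "rej", "con", "acc", "rej")
df <- data.frame(a, b)

# Comparing a character vector with a string gives a logical vector
# (is_acc is a hypothetical helper name, here for group a == 1 only)
is_acc <- df$b[df$a == 1] == "acc"
is_acc
# [1]  TRUE FALSE FALSE  TRUE FALSE

# sum() treats TRUE as 1 and FALSE as 0, so it counts the matches
sum(is_acc)
# [1] 2
```

summarise() simply repeats this calculation once per group of a.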
There are more concise ways of doing this; see, for example, the answers at:
- Frequency count of two column in R
- dplyr : how to get two-way tables with marginal proportions?
- Count multiple columns and group by in R
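As one possible example of such a two-way frequency count, base R's table() gives the same numbers without any packages. This is a sketch on the question's data, not necessarily what the linked answers use; counts and wide are illustrative names.

```r
# Data from the question
a <- c(1,1,1,1,1,2,2,2,2,2,3,3,3,3,3,4,4,4,4,4)
b <- c("acc", "rej", "con", "acc", "rej", "con", "acc", "rej", "con", "acc",
       "rej", "con", "acc", "rej", "con", "acc", "rej", "con", "acc", "rej")
df <- data.frame(a, b)

# Two-way contingency table: rows are values of a, columns are values of b
counts <- table(df$a, df$b)

# Convert the table to a wide data frame (row names hold the values of a)
wide <- as.data.frame.matrix(counts)
wide
#   acc con rej
# 1   2   1   2
# 2   2   2   1
# 3   1   2   2
# 4   2   1   2
```

Note that the grouping variable ends up in the row names rather than in a column, so an extra step is needed if a is required as a regular column.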
CodePudding user response:
Here is an alternative way:
We could use pivot_wider() with names_glue after count():
library(tidyr)
library(dplyr)
df %>%
  count(a, b) %>%             # one row per (a, b) combination, count in n
  pivot_wider(
    names_from = b,
    values_from = n,
    names_glue = "{b}_number" # column names: acc_number, con_number, rej_number
  )
#       a acc_number con_number rej_number
#   <dbl>      <int>      <int>      <int>
# 1     1          2          1          2
# 2     2          2          2          1
# 3     3          1          2          2
# 4     4          2          1          2
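If the columns should be named number_acc, number_rej and number_con as in the question, a small variation is to use the names_prefix argument of pivot_wider() instead of names_glue. The sketch below assumes dplyr and tidyr are installed; values_fill = 0 is an optional addition that makes missing combinations show up as 0 instead of NA (not needed for this particular data), and result is an illustrative name.

```r
library(dplyr)
library(tidyr)

# Data from the question
a <- c(1,1,1,1,1,2,2,2,2,2,3,3,3,3,3,4,4,4,4,4)
b <- c("acc", "rej", "con", "acc", "rej", "con", "acc", "rej", "con", "acc",
       "rej", "con", "acc", "rej", "con", "acc", "rej", "con", "acc", "rej")
df <- data.frame(a, b)

result <- df %>%
  count(a, b) %>%
  pivot_wider(
    names_from = b,
    values_from = n,
    names_prefix = "number_", # prepend "number_" to each level of b
    values_fill = 0           # optional: 0 instead of NA for absent combinations
  )

names(result)
# [1] "a"          "number_acc" "number_con" "number_rej"
```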