Is there an R function that can aggregate the count of a specific row in a categorical column?

I hope everyone is doing well. I am having a bit of a brain fart trying to aggregate in R. Let's say I have this df:

student	subject
Amber	math
Colin	math
Bob	science
Amber	math
Amber	science

And I want to get a count of the number of times the student's subject is math and add that to the data frame, so the result would look like this:

student	subject	total 'math'
Amber	math	2
Colin	math	1
Bob	science	0
Amber	math	2
Amber	science	2

Is this possible? I tried aggregate(subject["math"] ~ student, data = df, length) just to get the first part done, but I get "Error in model.frame.default(formula = subject["math"] ~ : variable lengths differ (found for 'student')".

Thank you in advance!

CodePudding user response：

I tried a different approach. It differs from your desired output, but does it work for you?

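library(dplyr)  # provides %>%, group_by(), summarise() and n()

my_df <- data.frame("Student" = c("Amber", "Colin", "Bob", "Amber", "Amber"),
                "Subject" = c("math", "math", "science", "math", "science"),
                stringsAsFactors = FALSE)

# One row per Student/Subject pair, with the number of occurrences in Total
my_df <- my_df %>% group_by(Student, Subject) %>% summarise("Total" = n())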

CodePudding user response：

library(dplyr)
df_with_count <- df %>% group_by(student, subject) %>% mutate(count = n())

found here: https://www.tutorialspoint.com/how-to-add-a-new-column-in-an-r-data-frame-with-count-based-on-factor-column
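
Note that grouping by both columns counts each student/subject pair, so Bob's science row gets 1 rather than 0. If the goal is the per-student math count from the question, one possible variation is to group by student only and sum a logical test. This is a minimal sketch that assumes the question's column names (student, subject); total_math is just an illustrative column name.

```r
library(dplyr)

# Sample data from the question
df <- data.frame(
  student = c("Amber", "Colin", "Bob", "Amber", "Amber"),
  subject = c("math", "math", "science", "math", "science")
)

# subject == "math" is TRUE/FALSE per row; sum() counts the TRUEs
# within each student group. total_math is an illustrative name.
df_with_math <- df %>%
  group_by(student) %>%
  mutate(total_math = sum(subject == "math")) %>%
  ungroup()

print(df_with_math)
```

Because mutate() keeps every row, the original row order is preserved and students with no math rows get 0.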

CodePudding user response：

I think you want something like this:

library(magrittr)
library(dplyr)

df <- data.frame(
   student = c("Amber", "Colin", "Bob", "Amber", "Amber"),
   subject = c("math", "math", "science", "math", "science")
)

df2 <- df %>%
   group_by(student, subject) %>%
   mutate(`Total math` = n()) %>%
   filter(`Total math` > 0) %>%
   filter(subject == "math") %>%
   distinct()

merge(x = df, y = df2, by = "student", all.x = TRUE) %>%
   mutate(`Total math` = ifelse(!is.na(`Total math`), `Total math`, 0)) %>%
   rename(subject = "subject.x") %>%
   select(student, subject, `Total math`) %>%
   print()
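
If you would rather avoid the merge step (merge() also sorts the rows by student by default), the same column can probably be built in base R with ave(). This is only a sketch under the question's column names; total_math is an illustrative name, not something from the original post.

```r
# Sample data from the question
df <- data.frame(
  student = c("Amber", "Colin", "Bob", "Amber", "Amber"),
  subject = c("math", "math", "science", "math", "science")
)

# ave() applies FUN within each student group and returns a vector
# aligned with the original rows, so no join is needed.
# as.integer() turns the TRUE/FALSE test into 1/0 before summing.
df$total_math <- ave(as.integer(df$subject == "math"), df$student, FUN = sum)

print(df)
```

This keeps the rows in their original order and needs no extra packages.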