Is there an R function that can aggregate the count of a specific row in a categorical column?

Time:12-02

I hope everyone is doing well. I am having a bit of a brain fart trying to aggregate in R. Let's say I have this df:

student subject
Amber math
Colin math
Bob science
Amber math
Amber science

And I want to get a count of the number of times the student's subject is math and add that to the data frame, so the result would look like this:

student subject total 'math'
Amber math 2
Colin math 1
Bob science 0
Amber math 2
Amber science 2

Is this possible? I tried aggregate(subject["math"] ~ student, data = df, length) just to get the first part done, but I get "Error in model.frame.default(formula = subject["math"] ~ : variable lengths differ (found for 'student')".

Thank you in advance!

CodePudding user response:

I tried a different approach. It does not match your desired output exactly, but does it work for you?

library(dplyr)

my_df <- data.frame("Student" = c("Amber", "Colin", "Bob", "Amber", "Amber"),
                "Subject" = c("math", "math", "science", "math", "science"),
                stringsAsFactors = FALSE)

# One row per Student/Subject pair, with the number of rows in that pair
my_df <- my_df %>% group_by(Student, Subject) %>% summarise("Total" = n())

CodePudding user response:

library(dplyr)
# n() counts rows per student/subject pair, so Amber's science row gets 1, not 2
df_with_count <- df %>% group_by(student, subject) %>% mutate(count = n())

found here: https://www.tutorialspoint.com/how-to-add-a-new-column-in-an-r-data-frame-with-count-based-on-factor-column

CodePudding user response:

I think that you want something like this:

library(magrittr)
library(dplyr)

df <- data.frame(
   student = c("Amber", "Colin", "Bob", "Amber", "Amber"),
   subject = c("math", "math", "science", "math", "science")
)

# Count rows per student/subject pair, then keep one row per student who takes math
df %>%
   group_by(student, subject) %>%
   mutate(`Total math` = n()) %>%
   filter(`Total math` > 0) %>%
   filter(subject == "math") %>%
   distinct -> df2

# Join the counts back; students with no math rows get 0
merge(x = df, y = df2, by = "student", all.x = TRUE) %>%
   mutate(`Total math` = ifelse(!is.na(`Total math`), `Total math`, 0)) %>%
   rename(subject = "subject.x") %>%
   select(student, subject, `Total math`) %>%
   print
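
For the exact output shown in the question, a shorter route may be enough: flag the math rows and sum the flags within each student. The following is a minimal base R sketch, not taken from any of the answers; it assumes the column names from the question, and total_math is a column name made up for the illustration.

```r
# Sample data from the question.
df <- data.frame(
  student = c("Amber", "Colin", "Bob", "Amber", "Amber"),
  subject = c("math", "math", "science", "math", "science"),
  stringsAsFactors = FALSE
)

# Flag math rows as 1/0, then sum the flags within each student.
# ave() returns a vector as long as its input, so every row of a
# student receives that student's math total.
# `total_math` is a column name chosen for this sketch.
df$total_math <- ave(as.integer(df$subject == "math"), df$student, FUN = sum)

print(df)
```

Because the sum runs over a logical test instead of a row count, students with no math rows get 0 without any join or NA handling. With dplyr loaded, the same idea can be written as df %>% group_by(student) %>% mutate(total_math = sum(subject == "math")).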
