How to sum values in one column based on values in other columns in R?

Time:10-16

I have a data set with hundreds of participant and control responses to 26 questions. Each participant answered each of the 26 questions with Yes (1), No (-1), Maybe (0), or did not answer (NA).

For each participant, I want to count one specific response across all 26 questions and save the total to a new column. So if they answered Yes (1) 12 out of 26 times, then the new column should have the number 12 in it -- ignoring the No (-1) values.

I have tried for loops, if-else statements, subsetting, group by and sum, etc. I just can't figure out how to go through all 26 questions and sum only one participant's responses, ignoring the other participants.

Edit: Here is a representative example of what the data looks like.

      ID PatientResponse ControlResponse QuestionNumber
1 122047               1               0              1
2 123274              -1              -1              1
3 186223               1               1              1
4 122047               0              -1              2
5 123274               1              -1              2
6 186223              -1               0              2

Here is an image of what one question looks like for various participants: https://i.stack.imgur.com/ojGGO.png

Here is what I would like it to ideally look like after all 26 questions have been summed for each participant : https://i.stack.imgur.com/W6Qo3.png

Answer:

library(dplyr)
library(tidyr)

# Count each kind of response, with one column per response value
df %>%
  count(Question, Participant, Control) %>%
  pivot_wider(names_from = Control, values_from = n)

# If you just want the Yes (1) responses counted
df %>%
  group_by(Question, Participant) %>%
  summarize(Summed_Yes_Responses = sum(Control == 1, na.rm = TRUE))
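
The snippets above use the column names from the screenshot. As a rough sketch of the same idea applied to the text sample in the question, the following groups by ID only, so each participant gets one total across all questions. The data frame here is rebuilt by hand from the six sample rows, and the output names PatientYes and ControlYes are made up for illustration; adjust them to your real data.

```r
library(dplyr)

# Sample rows copied from the question (only 2 of the 26 questions)
df <- data.frame(
  ID              = c(122047, 123274, 186223, 122047, 123274, 186223),
  PatientResponse = c(1, -1, 1, 0, 1, -1),
  ControlResponse = c(0, -1, 1, -1, -1, 0),
  QuestionNumber  = c(1, 1, 1, 2, 2, 2)
)

# One row per participant: count the Yes (1) answers across all questions.
# PatientYes / ControlYes are hypothetical column names.
# na.rm = TRUE skips unanswered (NA) questions instead of returning NA.
yes_counts <- df %>%
  group_by(ID) %>%
  summarize(
    PatientYes = sum(PatientResponse == 1, na.rm = TRUE),
    ControlYes = sum(ControlResponse == 1, na.rm = TRUE),
    .groups = "drop"
  )

print(yes_counts)
```

If the totals should sit in a new column next to the original rows rather than in a one-row-per-participant summary, replacing summarize() with mutate() (and dropping the .groups argument) should do that. Changing == 1 to == -1 or == 0 would count No or Maybe answers instead.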