I have a data set in which the responses to a series of repeated questions are the outcome of interest. I'd like to count the number of "I don't know" responses per respondent ID and append that count as a new column. So basically, I have data that look like this:
ID | response |
---|---|
1 | Yes |
1 | I don't know |
2 | No |
2 | I don't know |
And I want them to look like this:
ID | response | idkcount |
---|---|---|
1 | Yes | 1 |
1 | I don't know | 1 |
2 | No | 1 |
2 | I don't know | 1 |
This is the code I've most recently written:
df$idkcount <- group_by(as_tibble(df$ID)) %>% count(df$response == "I don't know")
But I seem to get an error message no matter what I try with these two commands. What am I missing?
CodePudding user response:
Using group_by and mutate you could do:
Note: I slightly altered your example data to a more general case.
df <- data.frame(
  ID = c(1L, 1L, 1L, 1L, 2L, 2L),
  response = c("Yes", "I don't know", "I don't know", "I don't know", "No", "I don't know")
)
library(dplyr)
df %>%
  group_by(ID) %>%
  mutate(idkcount = sum(response == "I don't know", na.rm = TRUE)) %>%
  ungroup()
#> # A tibble: 6 × 3
#>      ID response     idkcount
#>   <int> <chr>           <int>
#> 1     1 Yes                 3
#> 2     1 I don't know        3
#> 3     1 I don't know        3
#> 4     1 I don't know        3
#> 5     2 No                  1
#> 6     2 I don't know        1
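If you would rather not load dplyr, roughly the same result can be had in base R. This is only a sketch, assuming the same df as in the answer above; is_idk is a helper name made up for illustration. ave() applies a function within each group and returns a vector as long as its input, which is why the result can be assigned straight back as a column.

```r
# Same example data as in the answer above
df <- data.frame(
  ID = c(1L, 1L, 1L, 1L, 2L, 2L),
  response = c("Yes", "I don't know", "I don't know", "I don't know", "No", "I don't know")
)

# Flag the rows of interest as 0/1 (is_idk is a hypothetical helper name)
is_idk <- as.integer(df$response == "I don't know")

# Sum the flags within each ID; ave() repeats the group total on every row
df$idkcount <- ave(is_idk, df$ID, FUN = function(x) sum(x, na.rm = TRUE))

df$idkcount
#> [1] 3 3 3 3 1 1
```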
CodePudding user response:
library(dplyr)
my_df <- data.frame("id" = c(1, 1, 2, 2, 3),
                    "response" = c("I don't know", "I don't know", "no", "I don't know", "maybe"),
                    stringsAsFactors = FALSE)
# Within each id, count the rows equal to "I don't know" and repeat that count on every row
my_df <- my_df %>% group_by(id) %>% mutate(count = length(which(response == "I don't know")))
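For what it's worth, length(which(...)) and sum(..., na.rm = TRUE) should give the same count; they only differ from a plain sum() when the response column contains NA. A small base R sketch to illustrate (the vector x and the variable names are made up for this example):

```r
# A made-up response vector with one missing value
x <- c("I don't know", NA, "no", "I don't know")

# Comparing against NA yields NA, so hits is TRUE NA FALSE TRUE
hits <- x == "I don't know"

n_which <- length(which(hits))      # which() ignores NA, so this is 2
n_sum   <- sum(hits, na.rm = TRUE)  # NA dropped explicitly, also 2
n_naive <- sum(hits)                # NA, because the missing value propagates
```

So either form is safe inside mutate(); it is only the bare sum() that would turn a group's count into NA if one response is missing.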
CodePudding user response:
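A possible solution (I am using @stefan's dataset). Note that count() returns one row per ID/response combination with its frequency, rather than appending a column to the original rows: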
library(tidyverse)
df <- data.frame(
  ID = c(1L, 1L, 1L, 1L, 2L, 2L),
  response = c("Yes", "I don't know", "I don't know", "I don't know", "No", "I don't know")
)
df %>%
  count(ID, response, name = "idkcount")
#>   ID     response idkcount
#> 1  1 I don't know        3
#> 2  1          Yes        1
#> 3  2 I don't know        1
#> 4  2           No        1
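If the goal is to keep every original row and attach the per-ID count, as in the question, one variation on the count() idea is add_count() with a wt argument. This is just a sketch, assuming dplyr is installed and using the same df as above; out is a name chosen for illustration. With wt supplied, the count for each group is the sum of wt, so a logical condition ends up counting the rows where it is TRUE.

```r
library(dplyr)

# Same example data as above
df <- data.frame(
  ID = c(1L, 1L, 1L, 1L, 2L, 2L),
  response = c("Yes", "I don't know", "I don't know", "I don't know", "No", "I don't know")
)

# add_count() keeps all rows; wt sums the TRUE/FALSE condition within each ID
out <- df %>%
  add_count(ID, wt = response == "I don't know", name = "idkcount")

out$idkcount
#> [1] 3 3 3 3 1 1
```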