In my current data frame, some students have both a math course as well as an algebra or pre-algebra course (in the below example, that is students 2 and 4).
student_courses <-
tribble(
~ student_id, ~class, ~letter_grade,
001, "math", "A",
001, "science", "A",
001, "PE", "C",
002, "math", "B",
002, "algebra", "A",
002, "science", "B",
002, "PE", "B",
003, "math", "B",
003, "science", "B",
003, "PE", "B",
004, "math", "B",
004, "pre-algebra", "B",
004, "science", "B",
004, "PE", "B"
)
I need to filter the data frame in such a way that if a student has both a math and pre-algebra/algebra grade, their math grade will be filtered out. Below is what I need the data frame to look like.
student_courses_answer <-
tribble(
~ student_id, ~class, ~letter_grade,
001, "math", "A",
001, "science", "A",
001, "PE", "C",
002, "algebra", "A",
002, "science", "B",
002, "PE", "B",
003, "math", "B",
003, "science", "B",
003, "PE", "B",
004, "pre-algebra", "B",
004, "science", "B",
004, "PE", "B"
)
I've tried grouping by student_id
and then stacking filters, but this doesn't work. I'm stumped.
Any Ideas?
CodePudding user response:
The condition is !(any(grepl("algebra", class)) & class == "math")
in each ID. The use of grepl("algebra", class)
can match both "algebra" and "pre-algebra".
library(dplyr)
student_courses %>%
group_by(student_id) %>%
filter( !(any(grepl("algebra", class)) & class == "math") ) %>%
ungroup()
# # A tibble: 12 × 3
# student_id class letter_grade
# <dbl> <chr> <chr>
# 1 1 math A
# 2 1 science A
# 3 1 PE C
# 4 2 algebra A
# 5 2 science B
# 6 2 PE B
# 7 3 math B
# 8 3 science B
# 9 3 PE B
# 10 4 pre-algebra B
# 11 4 science B
# 12 4 PE B
CodePudding user response:
A dplyr
approach:
library(dplyr)
student_courses |>
group_by(student_id) |>
mutate(
remove_math = ("math" %in% class) & ("algebra" %in% class | "pre-algebra" %in% class)
) |>
filter(
!(class == "math" & remove_math)
) |>
select(-remove_math) |>
ungroup()
# # A tibble: 12 x 3
# student_id class letter_grade
# <dbl> <chr> <chr>
# 1 1 math A
# 2 1 science A
# 3 1 PE C
# 4 2 algebra A
# 5 2 science B
# 6 2 PE B
# 7 3 math B
# 8 3 science B
# 9 3 PE B
# 10 4 pre-algebra B
# 11 4 science B
# 12 4 PE B
EDIT: Added algebra to the condition (was previously just pre-algebra).