I want to remove rows in dataframe1 with a date that occurs before the date in dataframe2, based on the specific course.
Dataframe1 that looks like this:
Participant ID | Course | Date |
---|---|---|
1 | BIO 110 | 2022-02-03 |
2 | BIO 110 | 2022-02-01 |
3 | CHEM 110 | 2022-03-15 |
Dataframe 2 looks like this:
Course | Date |
---|---|
BIO 110 | 2022-02-03 |
CHEM 110 | 2022-03-15 |
The desired output:
Participant ID | Course | Date |
---|---|---|
1 | BIO 110 | 2022-02-03 |
3 | CHEM 110 | 2022-03-15 |
CodePudding user response:
You may try the below code
df <- df1 %>% left_join(df2, by='Course') %>% filter(Date.x>=Date.y)
CodePudding user response:
You will need to ensure that the dates are recognized as dates before filtering.
library(tidyverse)
df_1 <- tibble::tribble(
~Participant.ID, ~Course, ~Date,
1L, "BIO 110", "2022-02-03",
2L, "BIO 110", "2022-02-01",
3L, "CHEM 110", "2022-03-15"
)
df_2 <- tibble::tribble(
~Course, ~Date,
"BIO 110", "2022-02-03",
"CHEM 110", "2022-03-15"
) %>%
dplyr::rename(Min_Date = Date)
dplyr::left_join(df_1, df_2, by = "Course") %>%
dplyr::mutate(Date = lubridate::as_date(Date),
Min_Date = lubridate::as_date(Min_Date)) %>%
dplyr::filter(Date >= Min_Date)
CodePudding user response:
Using purrr
we can do:
library(tidyverse)
map2_dfr(df2$Course, df2$Date,
~ df1 %>% filter(Course == .x, Date >= .y))
## # A tibble: 2 x 3
## Participant.ID Course Date
## <int> <chr> <chr>
## 1 1 BIO 110 2022-02-03
## 2 3 CHEM 110 2022-03-15