I'm looking for elegant code that filters rows matching any one of several paired conditions (each birth_year tied to its own graduation_year) into a new df:
filter (birth_year==1987 & graduation_year==2005)
(birth_year==1990 & graduation_year==2008)
(birth_year==1993 & graduation_year==2011)
(birth_year==1998 & graduation_year==2016)
# this is pseudocode, not working code
Here is my df:
birth_year <- c(1987,1987,1987,1990,1990,1990,1993,1993,1998,1998)
graduation_year <- c(2005,2005,2006,2008,2007,2008,2011,2012,2017,2016)
grade<-c(56,101,85,120,75,96,85,68,91,105)
df <- data.frame(birth_year, graduation_year, grade)
> df
birth_year graduation_year grade
1 1987 2005 56
2 1987 2005 101
3 1987 2006 85
4 1990 2008 120
5 1990 2007 75
6 1990 2008 96
7 1993 2011 85
8 1993 2012 68
9 1998 2017 91
10 1998 2016 105
The resulting new df should be:
birth_year graduation_year grade
1 1987 2005 56
2 1987 2005 101
3 1990 2008 120
4 1990 2008 96
5 1993 2011 85
6 1998 2016 105
CodePudding user response:
In this specific case, every target pair is exactly 18 years apart, so I think you could do it like this:
df %>% filter(graduation_year - birth_year == 18)
Another idea is creating a reference data frame with your conditions like this:
birth_year <- c(1987, 1990, 1993, 1998)
graduation_year <- c(2005, 2008, 2011, 2016)
conditions <- data.frame(birth_year, graduation_year)
And then applying inner_join:
df %>% inner_join(conditions)
Joining, by = c("birth_year", "graduation_year")
birth_year graduation_year grade
1 1987 2005 56
2 1987 2005 101
3 1990 2008 120
4 1990 2008 96
5 1993 2011 85
6 1998 2016 105
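If the reference table might later gain extra columns or repeated rows, a filtering join such as semi_join may be a safer fit, since it keeps only the columns of df and never duplicates its rows. A minimal sketch, assuming dplyr is installed and reusing the df and conditions objects from this thread (new_df is just an illustrative name):

```r
library(dplyr)

df <- data.frame(
  birth_year      = c(1987, 1987, 1987, 1990, 1990, 1990, 1993, 1993, 1998, 1998),
  graduation_year = c(2005, 2005, 2006, 2008, 2007, 2008, 2011, 2012, 2017, 2016),
  grade           = c(56, 101, 85, 120, 75, 96, 85, 68, 91, 105)
)

# Allowed (birth_year, graduation_year) pairs, one pair per row
conditions <- data.frame(
  birth_year      = c(1987, 1990, 1993, 1998),
  graduation_year = c(2005, 2008, 2011, 2016)
)

# Keep the rows of df that have a matching pair in conditions.
# Naming `by` explicitly avoids the "Joining, by = ..." message.
new_df <- df %>%
  semi_join(conditions, by = c("birth_year", "graduation_year"))
```

Adding a new pair then only means adding a row to conditions, rather than editing the filter expression.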
CodePudding user response:
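You can use the %in% operator. Note that this tests each column independently, so it would also keep cross-combinations such as birth_year 1987 with graduation_year 2008; it gives the expected result here only because no such rows exist in df.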
library(dplyr)
new_df <- df %>%
filter(birth_year %in% c(1987, 1990, 1993, 1998),
graduation_year %in% c(2005, 2008, 2011, 2016)
)
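To match the two columns strictly as pairs while still using %in%, one option is to compare a combined key per row. This is a sketch in base R rather than part of the answer above; allowed_birth, allowed_grad, row_key and allowed_key are hypothetical names introduced for illustration:

```r
df <- data.frame(
  birth_year      = c(1987, 1987, 1987, 1990, 1990, 1990, 1993, 1993, 1998, 1998),
  graduation_year = c(2005, 2005, 2006, 2008, 2007, 2008, 2011, 2012, 2017, 2016),
  grade           = c(56, 101, 85, 120, 75, 96, 85, 68, 91, 105)
)

# Allowed pairs, aligned by position: (1987, 2005), (1990, 2008), ...
allowed_birth <- c(1987, 1990, 1993, 1998)
allowed_grad  <- c(2005, 2008, 2011, 2016)

# Build one text key per row and one per allowed pair, e.g. "1987 2005"
row_key     <- paste(df$birth_year, df$graduation_year)
allowed_key <- paste(allowed_birth, allowed_grad)

# Keep only the rows whose pair appears among the allowed pairs
new_df <- df[row_key %in% allowed_key, ]
```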
CodePudding user response:
You can simply separate your conditions with | so that you keep all of the records that meet any one of them.
library(dplyr)
new_df <- df %>% filter(birth_year==1987 & graduation_year==2005 |
birth_year==1990 & graduation_year==2008 |
birth_year==1993 & graduation_year==2011 |
birth_year==1998 & graduation_year==2016)