I have a data frame named titanic with 2021 rows, one per passenger on the Titanic, recording a few characteristics of each passenger:
Class Sex Age Survived
1 3rd Male Child No
2 3rd Male Child No
3 3rd Male Child No
4 3rd Male Child No
5 3rd Male Child No
6 3rd Male Child No
...
I want to create a function that has multiple arguments that looks something like this:
f1 <- function(sex, age, class, survived){
...
}
where each argument is a criterion for selecting passengers. As an example, I want to be able to pass criteria to the function such that
f1("Female", "Child","3rd", "Yes")
returns
Class Sex Age Survived
1534 3rd Female Child Yes
1535 3rd Female Child Yes
1536 3rd Female Child Yes
1537 3rd Female Child Yes
1538 3rd Female Child Yes
So far I have hard-coded it with an if/else chain that covers every possible combination:
function.q6.1 <- function(sex, age, class, survival){
  if(sex == "Male" & age == "Child" & class == "3rd" & survival == "No"){
    subset(titanic, Sex == "Male" & Age == "Child" & Class == "3rd" & Survived == "No")
  }
  else if(sex == "Female" & age == "Child" & class == "3rd" & survival == "No"){
    subset(titanic, Sex == "Female" & Age == "Child" & Class == "3rd" & Survived == "No")
  }
  else if(sex == "Male" & age == "Adult" & class == "3rd" & survival == "No"){
    subset(titanic, Sex == "Male" & Age == "Adult" & Class == "3rd" & Survived == "No")
  }
  ...
}
I want to know if there is a more efficient way of doing this. Thank you ahead of time.
CodePudding user response:
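If your data is in a data frame like the one shown in your question (called df here; substitute titanic for your own data), you could use
library(dplyr)
my_filter <- function(sex, age, class, survived) {
  # Keep only the rows where every column equals the corresponding argument
  df %>%
    filter(Sex == sex, Age == age, Class == class, Survived == survived)
}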
Now my_filter("Female", "Child","3rd", "Yes")
returns
Class Sex Age Survived
7 3rd Female Child Yes
8 3rd Female Child Yes
9 3rd Female Child Yes
10 3rd Female Child Yes
11 3rd Female Child Yes
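One thing to watch with this approach: it works because the column names (Sex, Age, ...) differ from the argument names (sex, age, ...) by case. If a column and an argument ever share a name, as in a data frame with lowercase column names, sex == sex inside filter() compares the column with itself and keeps every row. A minimal sketch of a more defensive variant, assuming the data frame is passed in as an extra argument, uses the .data and .env pronouns. The names my_filter2 and demo are made up for illustration.

```r
library(dplyr)

# Hypothetical demo data whose lowercase column names clash
# with the argument names used below.
demo <- data.frame(
  class    = c("3rd", "3rd", "1st", "3rd"),
  sex      = c("Female", "Male", "Female", "Female"),
  age      = c("Child", "Child", "Adult", "Child"),
  survived = c("Yes", "No", "Yes", "No")
)

# .data$ always refers to a column and .env$ to a variable in the
# function's environment, so the comparison stays unambiguous even
# when both have the same name.
my_filter2 <- function(data, sex, age, class, survived) {
  data %>%
    filter(
      .data$sex == .env$sex,
      .data$age == .env$age,
      .data$class == .env$class,
      .data$survived == .env$survived
    )
}

result <- my_filter2(demo, "Female", "Child", "3rd", "Yes")
```

Passing the data frame explicitly also avoids relying on a global object with a fixed name.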
CodePudding user response:
# toy dataset
set.seed(1912)
titanic <- data.frame(class = sample(c("1st", "2nd", "3rd"), 100, replace = TRUE),
                      sex = sample(c("Male", "Female"), 100, replace = TRUE),
                      age = sample(c("Child", "Adult"), 100, replace = TRUE),
                      survival = sample(c("Yes", "No"), 100, replace = TRUE)
)
f1 <- function(sex, age, class, survival) {
  titanic[titanic$class == class & titanic$sex == sex & titanic$age == age & titanic$survival == survival, ]
}
f1("Female", "Child","3rd", "Yes")
class sex age survival
11 3rd Female Child Yes
15 3rd Female Child Yes
38 3rd Female Child Yes
71 3rd Female Child Yes
85 3rd Female Child Yes
94 3rd Female Child Yes
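A caveat with logical subsetting via [: if any of the compared columns contains NA, the comparison can yield NA for that row, and [ then returns a row filled with NA instead of dropping it. A sketch of one way around this, assuming rows with missing values should simply be excluded, wraps the condition in which(). The data frame toy and the name f1_na are hypothetical.

```r
# Hypothetical data with a few missing values
toy <- data.frame(
  class    = c("3rd", "3rd", NA, "3rd"),
  sex      = c("Female", "Female", "Female", "Male"),
  age      = c("Child", NA, "Child", "Child"),
  survival = c("Yes", "Yes", "Yes", "Yes")
)

f1_na <- function(data, sex, age, class, survival) {
  keep <- data$class == class & data$sex == sex &
    data$age == age & data$survival == survival
  # which() returns only the positions where keep is TRUE,
  # so rows where the comparison is NA are left out
  data[which(keep), ]
}

res <- f1_na(toy, "Female", "Child", "3rd", "Yes")
```

Here only the first row is returned. Indexing with keep directly would also return two all-NA rows for the second and third observations.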
CodePudding user response:
This assumes that the first argument is the data frame and the remaining arguments are values for each of the columns in the order that they appear in the data frame.
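Rows containing NA are dropped first with na.omit. mapply then compares each column with the corresponding argument value, returning a logical matrix with one row per observation. apply reduces each row to a single logical value, TRUE only if every comparison in that row is TRUE, and we subset the rows by that vector.
The test call uses the data frame defined reproducibly in the Note at the end.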
f1 <- function(dat, ...) {
  dat <- na.omit(dat)
  dat[apply(mapply(`==`, dat, list(...)), 1, all), ]
}
f1(dat, "3rd", "Male", "Child", "No")
## Class Sex Age Survived
## 1 3rd Male Child No
## 2 3rd Male Child No
## 3 3rd Male Child No
## 4 3rd Male Child No
## 5 3rd Male Child No
## 6 3rd Male Child No
Note
Lines <- "
Class Sex Age Survived
1 3rd Male Child No
2 3rd Male Child No
3 3rd Male Child No
4 3rd Male Child No
5 3rd Male Child No
6 3rd Male Child No"
dat <- read.table(text = Lines)
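To make the two steps described above concrete, here is a sketch that spells out the intermediate results on a small hand-made data frame (small and values are hypothetical names, and this is not the Note data). Column order matters because the values are matched to the columns by position.

```r
small <- data.frame(
  Class    = c("3rd", "3rd", "1st"),
  Sex      = c("Male", "Female", "Male"),
  Age      = c("Child", "Child", "Child"),
  Survived = c("No", "No", "No")
)
values <- list("3rd", "Male", "Child", "No")

# Step 1: compare column i with value i. The result is a logical
# matrix with one row per observation and one column per variable.
m <- mapply(`==`, small, values)

# Step 2: a row is kept only if every comparison in it is TRUE.
keep <- apply(m, 1, all)

small[keep, ]
```

Only the first row passes all four comparisons: the second fails on Sex and the third on Class.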
CodePudding user response:
Maybe another strategy could be:
library(dplyr)
library(stringr)
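# f1 is assumed to start out as a character vector of the criteria
f1 <- c("Female", "Child", "3rd", "Yes")
# Combine the criteria into one regular expression: "Female|Child|3rd|Yes"
f1 <- paste(f1, collapse = "|")
my_function <- function(df){
  df %>%
    select(Sex, Age, Class, Survived) %>%
    # keep rows where every column matches at least one of the criteria
    filter(if_all(everything(), ~ str_detect(., f1)))
}
my_function(df)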
output:
Sex Age Class Survived
1534 Female Child 3rd Yes
1535 Female Child 3rd Yes
1536 Female Child 3rd Yes
1537 Female Child 3rd Yes
1538 Female Child 3rd Yes
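Be aware that pattern matching like this is a substring match: the pattern "Male" also matches "Female", and a value is accepted if it matches any of the criteria, not only the one intended for its column. A base R sketch of the difference, using a made-up vector, compares an unanchored pattern with an anchored one:

```r
sex <- c("Male", "Female", "Male")

# Unanchored: "Male" is found inside "Female" too.
loose <- grepl("Male", sex)

# Anchored with ^ and $: only the exact string "Male" matches.
strict <- grepl("^Male$", sex)
```

The same anchoring should carry over to str_detect(), since it also takes a regular expression.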