I'm new to R and have tried looking for the correct way to write this. I apologize if this is a relatively rudimentary question!
all_loan_offers <- function(df){
  df$decision <- c()
  for(i in 1:nrow(df)) {
    income <- df$income[i]
    c_record <- df$c_record[i]
    years <- df$years[i]
    decision <- df$decision[i]
    if (income > 80000){
      decision[i] <- "Admit"
    } else if(income < 40000 & c_record == FALSE){
      decision[i] <- "Admit"
    } else if(income < 40000 & c_record == TRUE){
      decision[i] <- "Admit"
    } else if(income >= 40000 & income <= 80000 & years > 3){
      decision[i] <- "Admit"
    } else {
      decision[i] <- "Reject"
    }
    return(df)
  }
}
applicants_info <- data.frame(income = c(40000, 80000, 25000, 70000),
                              c_record = c(F, T, T, F),
                              years = c(2, 10, 3, 6),
                              stringsAsFactors = F)
all_loan_offers(applicants_info)
Calling all_loan_offers just returns the applicants_info data frame unchanged, but my objective is to return an updated data frame with a fourth column, decision, set to "Admit" or "Reject" according to my if-else ladder.
Many thanks - any help at all would be greatly appreciated! :)
CodePudding user response:
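You don't need the local decision variable inside the for loop. Assigning to decision[i] changes only that local copy and never touches df, which is why the data frame comes back unchanged. Setting df$decision <- c() also does nothing, because c() is NULL and assigning NULL to a column that does not exist adds no column. Note too that return(df) sits inside the loop, so the function exits after the first row. Instead, you can set the initial value of all rows to "Reject" and then switch to "Admit" when a condition is met (dropping the last else { ... }).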
all_loan_offers <- function(df) {
  df$decision <- "Reject"
  for(i in 1:nrow(df)) {
    income <- df$income[i]
    c_record <- df$c_record[i]
    years <- df$years[i]
    if (income > 80000){
      df$decision[i] <- "Admit"
    } else if(income < 40000 & c_record == FALSE){
      df$decision[i] <- "Admit"
    } else if(income < 40000 & c_record == TRUE){
      df$decision[i] <- "Admit"
    } else if(income >= 40000 & income <= 80000 & years > 3){
      df$decision[i] <- "Admit"
    }
  }
  df
}
applicants_info <- data.frame(income = c(40000, 80000, 25000, 70000),
                              c_record = c(F, T, T, F),
                              years = c(2, 10, 3, 6),
                              stringsAsFactors = F)
all_loan_offers(applicants_info)
#>   income c_record years decision
#> 1  40000    FALSE     2   Reject
#> 2  80000     TRUE    10    Admit
#> 3  25000     TRUE     3    Admit
#> 4  70000    FALSE     6    Admit
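To see why the original function returned the data frame unchanged, here is a small sketch of the two points above. The data frame d is made up purely for illustration and is not part of the question.

```r
# Hypothetical toy data frame, used only for this illustration
d <- data.frame(income = c(40000, 90000))

# c() evaluates to NULL; assigning NULL to a column that
# does not exist adds nothing to the data frame
d$decision <- c()
is.null(c())               # TRUE
names(d)                   # still just "income"

# Writing to a local variable never modifies the data frame
decision <- d$decision     # NULL, because the column does not exist
decision[2] <- "Admit"     # changes only the local variable
"decision" %in% names(d)   # FALSE
```

Only an assignment of the form df$decision[i] <- ... (or another assignment into df itself) changes what the function returns.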
EDIT: Note that you can omit the for loop entirely, too. R operates on whole vectors of values, so each condition can be evaluated for every row at once. You can turn the resulting logical vector into a numeric index with which() and assign "Admit" to all matching rows in one step (the single value on the right-hand side is recycled to the length of the index):
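all_loan_offers <- function(df) {
  df$decision <- "Reject"
  df$decision[which(df$income > 80000)] <- "Admit"
  df$decision[which(df$income < 40000 & !df$c_record)] <- "Admit"
  # Kept for parity with the loop version, which also admits
  # income < 40000 when c_record is TRUE
  df$decision[which(df$income < 40000 & df$c_record)] <- "Admit"
  df$decision[which(df$income >= 40000 & df$income <= 80000 & df$years > 3)] <- "Admit"
  df
}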
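Another way to write the vectorized version is to combine the conditions into a single logical mask and pass it to ifelse(). This is only a sketch: the names all_loan_offers_vec and admit are made up here. It follows the loop's ladder as written, where both c_record branches for income under 40000 lead to "Admit", so c_record drops out of the mask. If a record was meant to cause a rejection, that condition would need to change.

```r
# Same rules as the if-else ladder, written as one logical mask.
# all_loan_offers_vec and admit are hypothetical names for this sketch.
all_loan_offers_vec <- function(df) {
  admit <- df$income > 80000 |
    df$income < 40000 |
    (df$income >= 40000 & df$income <= 80000 & df$years > 3)
  # ifelse() picks "Admit" or "Reject" element by element
  df$decision <- ifelse(admit, "Admit", "Reject")
  df
}

applicants_info <- data.frame(income = c(40000, 80000, 25000, 70000),
                              c_record = c(FALSE, TRUE, TRUE, FALSE),
                              years = c(2, 10, 3, 6))
result <- all_loan_offers_vec(applicants_info)
result$decision
#> [1] "Reject" "Admit"  "Admit"  "Admit"
```

Unlike which(), a bare logical mask propagates NA values, so rows with a missing income or years would get an NA decision here rather than staying "Reject".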