My data frame has six columns named exam 1, exam 2, exam 3, result exam 1, result exam 2 and result exam 3. The first three columns contain numbers and NAs; the last three contain Pass, Fail and NAs. I want to replace every NA with 0, every Pass with 1 and every Fail with 0, so that both Fail and NA end up as 0.
I have tried several approaches in R, but I can't make any of them work:
df[df == 'NA'] <- 0
df[df == NA] <- 0
df[df$"result exam 1" == "Pass",]$"result exam 1" = 1
df[df$"result exam 1" == "Fail",]$"result exam 1" = 0
None of these attempts work.
Would someone be able to please help with this problem?
Thank you
CodePudding user response:
You really need to get a better grasp of basic R syntax:
- You are putting the subset operator [ in the wrong place: you need to subset the column vector, not the data frame.
- You are then using the $ operator on the result of that row subset and assigning into it. When the condition contains NA, R rejects the subscripted assignment into the data frame and throws an error (which you should have posted).
- You are testing a value for missingness with ==. x == NA makes no sense: how can you compare against a non-available value? The result is always NA, so you must use the is.na() function.
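To make the point about missingness concrete, here is a minimal sketch on a made-up vector (x is a hypothetical example, not the asker's data). It shows that a comparison with NA never yields TRUE, while is.na() gives a usable logical index.

```r
# Hypothetical example vector, not taken from the question
x <- c(70, NA, 85)

# Comparing with NA yields NA for every element, never TRUE
cmp <- x == NA

# is.na() returns a proper logical vector marking the missing entries
missing <- is.na(x)

# Use that logical vector to replace the missing entries with 0
x[missing] <- 0
```

The same pattern applies to a single data frame column, since a column extracted with $ or [[ is just a vector.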
Here is what you should have done (with just a bit of help from basic R tutorials):
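# Index the column vector itself (assumes character columns); names with spaces need backticks
df$`result exam 1`[df$`result exam 1` == "Pass"] <- 1
df$`result exam 1`[df$`result exam 1` == "Fail"] <- 0
df$`result exam 1`[is.na(df$`result exam 1`)] <- 0
df$`exam 1`[is.na(df$`exam 1`)] <- 0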
CodePudding user response:
Assuming the data frame is named dt, make a vector of the result column names:
result <- c("result exam 1", "result exam 2", "result exam 3")
library(dplyr)
dt <- dt %>% mutate_at(result, ~ifelse(.x %in% "Pass", 1, 0))
This replaces every "Pass" with 1 and everything else, both "Fail" and NA, with 0. %in% is used rather than == because a comparison with NA returns NA, which ifelse() would pass through unchanged. For the NAs in the other columns:
dt[c("exam 1", "exam 2", "exam 3")][is.na(dt[c("exam 1", "exam 2", "exam 3")])] <- 0
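One detail worth checking with ifelse() is how it treats NA in the test. The sketch below uses a made-up vector (res is hypothetical) to compare an == test with an %in% test; it uses base R only.

```r
# Hypothetical result vector, not taken from the question
res <- c("Pass", "Fail", NA)

# With ==, the NA in the test propagates to the output
with_eq <- ifelse(res == "Pass", 1, 0)    # 1 0 NA

# With %in%, NA is simply "not a match", so it becomes 0
with_in <- ifelse(res %in% "Pass", 1, 0)  # 1 0 0
```

If the == form is used, a separate is.na() replacement is still needed on the result columns afterwards.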
CodePudding user response:
To do this for multiple columns in one go, you can use the following:
cols <- grep('result', names(df))
df[cols][is.na(df[cols])] <- 0
df[cols][df[cols] == 'Fail'] <- 0
df[cols][df[cols] == 'Pass'] <- 1
df
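As an end-to-end check, here is a minimal base R sketch on invented sample data that mirrors the column names in the question. The values, and the helper names res_cols and score_cols, are assumptions for illustration. Unlike the character-based replacement above, it converts the result columns to numeric 1/0 and also fills the NAs in the score columns.

```r
# Hypothetical sample data mirroring the column names in the question
df <- data.frame(
  "exam 1" = c(55, NA, 80),
  "exam 2" = c(NA, 62, 71),
  "exam 3" = c(90, 45, NA),
  "result exam 1" = c("Pass", NA, "Pass"),
  "result exam 2" = c(NA, "Fail", "Pass"),
  "result exam 3" = c("Pass", "Fail", NA),
  check.names = FALSE,       # keep the spaces in the column names
  stringsAsFactors = FALSE   # keep the results as character, not factor
)

# Result columns: 1 for "Pass", 0 for "Fail" and for NA
res_cols <- grep("^result", names(df), value = TRUE)
df[res_cols] <- lapply(df[res_cols], function(col) as.numeric(col %in% "Pass"))

# Score columns: replace NA with 0
score_cols <- setdiff(names(df), res_cols)
df[score_cols][is.na(df[score_cols])] <- 0
```

Anchoring the pattern with ^result keeps grep() from matching unrelated columns that merely contain the word "result" somewhere in their name.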