I'm trying to transform a data.frame in R by comparing its observations with the values in a list. For example, if the list says that "1" is the correct value, then every other value should be marked as wrong, or NA if it is missing.
As an example, I created a data.frame that contains 3 variables for 3 observations:
dat <- data.frame("Q" = c("a", "b", "a"),
                  "P" = c(1, 2, 4),
                  "R" = c("d", NA, "d"))
For each variable I defined a correct answer and wrote these in a list:
results <- list("a", 2, "d")
So for variable Q only "a" is correct, for P only 2, and for R only "d". Since I want to create a dataset of dummy variables, the result should look like this (one row per variable, one column per observation):
[,1] [,2] [,3]
[1,] "Yes" "No" "Yes"
[2,] "No" "Yes" "No"
[3,] "Yes" NA "Yes"
I tried to create a loop, but the result is not as expected:
mylist <- list()
for (j in 1:3) {
  vec <- character(3)
  for (i in 1:3) {
    ifelse(dat[i,j] == results[j], vec[j] <- "Yes",
           ifelse((is.na(dat[i,j]) == TRUE), vec[j] <- NA, vec[j] <- "No"))
  }
  mylist[[j]] <- vec
}
df <- do.call("rbind",mylist)
[,1] [,2] [,3]
[1,] "Yes" "" ""
[2,] "" "No" ""
[3,] "" "" "Yes"
I am very thankful for all of your answers :)
CodePudding user response:
mapply(`==`, dat, results)
Q P R
[1,] TRUE FALSE TRUE
[2,] FALSE TRUE NA
[3,] TRUE FALSE TRUE
Or, to get your expected output, use t():
t(mapply(`==`, dat, results))
[,1] [,2] [,3]
Q TRUE FALSE TRUE
P FALSE TRUE FALSE
R TRUE NA TRUE
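If the "Yes"/"No" labels from the question are wanted rather than TRUE/FALSE, one possible follow-up is to pass the logical matrix through ifelse(), which keeps the matrix shape and leaves NA as NA. This is only a sketch on the example data; the names is_correct and labels are made up for illustration.

```r
# Example data from the question
dat <- data.frame(Q = c("a", "b", "a"),
                  P = c(1, 2, 4),
                  R = c("d", NA, "d"))
results <- list("a", 2, "d")

# Compare each column with its own correct answer, then transpose so that
# rows are variables and columns are observations
is_correct <- t(mapply(`==`, dat, results))

# ifelse() keeps the dimensions and row names of its test argument;
# NA in the test stays NA in the result
labels <- ifelse(is_correct, "Yes", "No")
labels
```

The row for R should then hold NA in the second position, since comparing NA with "d" yields NA rather than FALSE.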
CodePudding user response:
out <- apply(dat,1,FUN = \(x) x==results)
out2 <- out
out2[out] <- "Yes"
out2[!out] <- "No"
gives
> out2
[,1] [,2] [,3]
Q "Yes" "No" "Yes"
P "No" "Yes" "No"
R "Yes" NA "Yes"
CodePudding user response:
Here is a dplyr solution:
library(dplyr)
dat %>%
  mutate(across(everything(), ~case_when(. %in% results ~ "yes",
                                         !(. %in% results) ~ "no",
                                         TRUE ~ NA_character_)))
Q P R
1 yes no yes
2 no yes no
3 yes no yes
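Note that %in% tests each value against every entry of results at once, and NA %in% results is FALSE, so the missing value in R comes out as "no" rather than NA. A possible variant that compares each column only with its own answer and keeps NA is sketched below. It assumes dplyr 1.0.0 or later (for cur_column()) and that results is in the same order as the columns of dat; keyed and labelled are hypothetical names.

```r
library(dplyr)

# Example data from the question
dat <- data.frame(Q = c("a", "b", "a"),
                  P = c(1, 2, 4),
                  R = c("d", NA, "d"))
results <- list("a", 2, "d")

# Assumption: results is ordered like the columns of dat, so the answers
# can be named after the columns and looked up by column name
keyed <- setNames(results, names(dat))

labelled <- dat %>%
  mutate(across(everything(),
                # if_else() returns NA where the comparison itself is NA
                ~ if_else(.x == keyed[[cur_column()]], "Yes", "No")))
labelled
```

Here rows are still observations and columns are variables; t(as.matrix(labelled)) would give the transposed layout shown in the question.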