I am trying to create a new column (TRUE/FALSE) based on the values of three other columns (also TRUE/FALSE). This is a small sample.
DF
ID 1 2 3
1 TRUE NA NA
2 FALSE TRUE NA
3 TRUE TRUE NA
4 TRUE FALSE NA
5 TRUE FALSE TRUE
6 FALSE FALSE TRUE
7 TRUE FALSE FALSE
8 FALSE NA NA
9 NA NA NA
The three columns do not all have the same number of values, hence the many NA values. The data are files that were checked: some of them were re-checked in column 2, and an even smaller portion was re-checked again in column 3.
So column 2 overrides column 1, and column 3 overrides columns 1 and 2.
I would like the following output:
ID 1 2 3 4
1 TRUE NA NA TRUE
2 FALSE TRUE NA TRUE
3 TRUE TRUE NA TRUE
4 TRUE FALSE NA FALSE
5 TRUE FALSE TRUE TRUE
6 FALSE FALSE TRUE TRUE
7 TRUE FALSE FALSE FALSE
8 FALSE NA NA FALSE
9 NA NA NA FALSE
The columns are of class character, so I tried the functions ifelse and grepl:
DF$`4` <- ifelse(
grepl("TRUE", DF$`1`) |
grepl("TRUE", DF$`2`) |
grepl("TRUE", DF$`3`), "TRUE", "FALSE")
But this only lets me specify conditions for the TRUE outcome. I don't know how to make the output FALSE when column 2 shows FALSE and column 1 shows TRUE.
I searched Stack Overflow for a similar question or answer (there probably is one) but couldn't find it, so I am asking here. Thank you in advance.
CodePudding user response:
You can take the last non-NA value in each row, falling back to FALSE when the whole row is NA:
df$v4 <- apply(df[, -1], 1, FUN = \(x) ifelse(all(is.na(x)), FALSE, unlist(tail(x[!is.na(x)], 1))))
#> df$v4
#[1] TRUE TRUE TRUE FALSE TRUE TRUE FALSE FALSE FALSE
Note: \(x) can replace function(x) in lambda-like functions since R 4.1.
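For completeness, here is a vectorized sketch of the same "later check overrides earlier check" rule that avoids the row-wise apply. It rebuilds the sample from the question. The column names v1, v2, v3 and v4 are assumptions made for illustration (bare numbers are not syntactic column names in R), and the columns are assumed to be character, as described in the question.

```r
# Sample data from the question; the names v1..v3 are assumed for illustration.
df <- data.frame(
  ID = 1:9,
  v1 = c("TRUE", "FALSE", "TRUE", "TRUE", "TRUE", "FALSE", "TRUE", "FALSE", NA),
  v2 = c(NA, "TRUE", "TRUE", "FALSE", "FALSE", "FALSE", "FALSE", NA, NA),
  v3 = c(NA, NA, NA, NA, "TRUE", "TRUE", "FALSE", NA, NA),
  stringsAsFactors = FALSE
)

# Convert the character columns to logical; NA stays NA.
checks <- lapply(df[c("v1", "v2", "v3")], as.logical)

# Start from the first check and let each later, non-missing check override it.
result <- checks$v1
result[!is.na(checks$v2)] <- checks$v2[!is.na(checks$v2)]
result[!is.na(checks$v3)] <- checks$v3[!is.na(checks$v3)]

# Rows that were never checked (all NA) default to FALSE, as in the desired output.
result[is.na(result)] <- FALSE

df$v4 <- result
```

Because the result is built with as.logical, v4 ends up as a logical column rather than character. If the character class has to be kept, wrapping the result in as.character() should be enough.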