Suppose I have a data frame as follows:
ColA ColB ColC
1 A NA NA
2 B NA F
3 C D G
I would like to add a column (NewColD) and use an if statement to determine its values. I would first like to check ColC. If the data in ColC is NOT "NA", I want to use ColC. Else if ColB is NOT "NA", I want to use ColB. Else, use what's in ColA.
If this works correctly, my new data frame would look like this:
ColA ColB ColC NewColD
1 A NA NA A
2 B NA F F
3 C D G G
Thank you in advance!
Answer:
Here is an option using dplyr::coalesce
library(dplyr)
df %>% mutate(NewColD = coalesce(ColC, ColB, ColA))
# ColA ColB ColC NewColD
#1 A <NA> <NA> A
#2 B <NA> F F
#3 C D G G
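To see what coalesce does on its own, here is a minimal sketch on plain vectors. The vectors x, y and z are made up for illustration and simply mirror ColC, ColB and ColA. For each position, coalesce returns the first value that is not NA, so the order of the arguments sets the priority.

```r
library(dplyr)

# Hypothetical vectors mirroring ColC, ColB and ColA from the question
x <- c(NA, "F", "G")
y <- c(NA, NA, "D")
z <- c("A", "B", "C")

# Element-wise: take x where it is not NA, else y, else z
res <- coalesce(x, y, z)
res
# [1] "A" "F" "G"
```

This is why ColC comes first in the mutate call above: it is the column that should win whenever it has a value.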
Or, more verbosely, using dplyr::case_when
library(dplyr)
df %>%
  mutate(NewColD = case_when(
    !is.na(ColC) ~ ColC,
    !is.na(ColB) ~ ColB,
    TRUE ~ ColA))
# ColA ColB ColC NewColD
#1 A <NA> <NA> A
#2 B <NA> F F
#3 C D G G
Or in base R using a nested ifelse structure
transform(df, NewColD = ifelse(
  !is.na(ColC), ColC, ifelse(!is.na(ColB), ColB, ColA)))
# ColA ColB ColC NewColD
#1 A <NA> <NA> A
#2 B <NA> F F
#3 C D G G
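If there are more than three columns, nesting ifelse calls gets hard to read. One possible base R alternative is to fold the same ifelse step over the columns with Reduce. This is only a sketch: first_non_na is a hypothetical helper name, not a base R function, and the data frame is rebuilt here so the snippet runs on its own.

```r
# Sample data matching the question, kept as character rather than factor
df <- data.frame(
  ColA = c("A", "B", "C"),
  ColB = c(NA, NA, "D"),
  ColC = c(NA, "F", "G"),
  stringsAsFactors = FALSE
)

# first_non_na() is a hypothetical helper, not part of base R.
# It takes vectors in priority order and, wherever the running result
# is still NA, fills it from the next vector.
first_non_na <- function(...) {
  Reduce(function(x, y) ifelse(is.na(x), y, x), list(...))
}

# Highest priority first: ColC, then ColB, then ColA
df$NewColD <- first_non_na(df$ColC, df$ColB, df$ColA)
print(df)
```

Like the nested ifelse version, this assumes the columns are character vectors. With factor columns ifelse would return the integer codes instead of the labels.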
Sample data
df <- read.table(text = " ColA ColB ColC
1 A NA NA
2 B NA F
3 C D G", header = TRUE, stringsAsFactors = FALSE)