How can I add a column to a data frame with values based on an if statement?

Time:10-16

Suppose I have a data frame as follows:

  ColA  ColB  ColC
1   A    NA    NA
2   B    NA    F
3   C    D     G

I would like to add a column (NewColD) and use an if statement to determine its values. I would first like to check ColC. If the data in ColC is NOT "NA", I want to use ColC. Else if ColB is NOT "NA", I want to use ColB. Else, use what's in ColA.

If this code executes correctly, my new data frame would look like this:

  ColA  ColB  ColC  NewColD
1   A    NA    NA      A
2   B    NA    F       F
3   C    D     G       G

Thank you in advance!

CodePudding user response:

Here is an option using dplyr::coalesce, which returns the first non-missing value across its arguments, taken element by element:

library(dplyr)
df %>% mutate(NewColD = coalesce(ColC, ColB, ColA))
#  ColA ColB ColC NewColD
#1    A <NA> <NA>       A
#2    B <NA>    F       F
#3    C    D    G       G

Or, more verbosely, using dplyr::case_when

library(dplyr)
df %>%
    mutate(NewColD = case_when(
        !is.na(ColC) ~ ColC,
        !is.na(ColB) ~ ColB,
        TRUE ~ ColA))
#  ColA ColB ColC NewColD
#1    A <NA> <NA>       A
#2    B <NA>    F       F
#3    C    D    G       G

Or in base R using a nested ifelse structure

transform(df, NewColD = ifelse(
    !is.na(ColC), ColC, ifelse(!is.na(ColB), ColB, ColA)))
#  ColA ColB ColC NewColD
#1    A <NA> <NA>       A
#2    B <NA>    F       F
#3    C    D    G       G
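
If there are more than three columns, nesting ifelse calls gets hard to read. As a rough sketch of the same idea in base R, you could fold a small two-argument helper over the columns in priority order. The helper name first_non_na is made up for this illustration and is not part of base R or dplyr, and the data frame is rebuilt here assuming character columns.

```r
# Same data as in the question, built directly (assumption: character columns)
df <- data.frame(
  ColA = c("A", "B", "C"),
  ColB = c(NA, NA, "D"),
  ColC = c(NA, "F", "G"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: keep x where it is not NA, otherwise fall back to y
first_non_na <- function(x, y) ifelse(is.na(x), y, x)

# Fold over the columns in priority order: ColC first, then ColB, then ColA
df$NewColD <- Reduce(first_non_na, df[c("ColC", "ColB", "ColA")])
df
```

To change the priority or add more fallback columns, only the vector of column names needs to change.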

Sample data

df <- read.table(text = "  ColA  ColB  ColC
1   A    NA    NA
2   B    NA    F
3   C    D     G", header = TRUE, stringsAsFactors = FALSE)  # keep columns as character (the default since R 4.0.0)