Replace numerical value in two columns with NA based on a single other column NA value in R

Time:03-02

I have simplified my df to:

A <- c("a", "b", "c", "d", "e", "f", "g", "NA", "h", "I")
B <- c(NA, 2, 3, 4, NA, NA, 5, 6, 8, NA)
C <- c(NA, 9, 8, 4, 5, 7, 5, 6, NA, NA)
D <- c(NA, 1, NA, 3, NA, 5, NA, NA, 8, NA)
E <- c(1,2,3,4,5,6,7,8,9,10)

df <- data.frame(A, B, C, D, E)

I would like to write general code that sets the values in columns B and C to NA wherever column D is NA. The resulting df2 would be:

A <- c("a", "b", "c", "d", "e", "f", "g", "NA", "h", "I")
B <- c(NA, 2, NA, 4, NA, NA, NA, NA, 8, NA)
C <- c(NA, 9, NA, 4, NA, 7, NA, NA, NA, NA)
D <- c(NA, 1, NA, 3, NA, 5, NA, NA, 8, NA)
E <- c(1,2,3,4,5,6,7,8,9,10)

df2 <- data.frame(A, B, C, D, E)

My attempt so far is below; it fails with the error "unused argument (as.numeric(B))":

df2 <- df %>% na_if(is.na(D), as.numeric(B)) %>%
  na_if(is.na(D), as.numeric(C))

Any help would be greatly appreciated. I cannot install library(naniar), so please no solutions that use replace_with_na_at.

Thank you!

CodePudding user response:

With dplyr, we can apply a simple ifelse statement to both B and C using across and replace with NA when they meet the condition (i.e., D is NA).

library(dplyr)

output <- df %>% 
  mutate(across(B:C, ~ ifelse(is.na(D), NA, .x)))

Output

    A  B  C  D  E
1   a NA NA NA  1
2   b  2  9  1  2
3   c NA NA NA  3
4   d  4  4  3  4
5   e NA NA NA  5
6   f NA  7  5  6
7   g NA NA NA  7
8  NA NA NA NA  8
9   h  8 NA  8  9
10  I NA NA NA 10

Test

identical(output, df2)
# [1] TRUE
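
As for the error in the question, here is a minimal sketch of what goes wrong, assuming the usual two-argument form na_if(x, y) from dplyr. The vector x and the data frame small are hypothetical toy inputs, not part of the original data.

```r
library(dplyr)

# na_if(x, y) takes two arguments: a vector x and the value y to turn into NA.
x <- c(1, -99, 3)            # hypothetical toy vector
cleaned <- na_if(x, -99)     # the -99 entry becomes NA

# In the question, the pipe passes df as the first argument, so the call is
# effectively na_if(df, is.na(D), as.numeric(B)): three arguments, hence
# "unused argument (as.numeric(B))".

# Replacing values based on ANOTHER column needs a vectorised if/else instead.
small <- data.frame(B = c(2, 3, 4), D = c(1, NA, 3))   # hypothetical toy data
fixed <- small %>% mutate(B = ifelse(is.na(D), NA, B))
```

In other words, na_if compares a vector against a value, while the task here depends on a condition in a different column, which is why ifelse inside mutate fits better.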

CodePudding user response:

Base R

A base R solution with Map and is.na<-. The \(x, y) lambda shorthand needs R 4.1 or later; on older versions write function(x, y) instead.

A <- c("a", "b", "c", "d", "e", "f", "g", "NA", "h", "I")
B <- c(NA, 2, 3, 4, NA, NA, 5, 6, 8, NA)
C <- c(NA, 9, 8, 4, 5, 7, 5, 6, NA, NA)
D <- c(NA, 1, NA, 3, NA, 5, NA, NA, 8, NA)
E <- c(1,2,3,4,5,6,7,8,9,10)

df <- data.frame(A, B, C, D, E)

df[c("B", "C")] <- Map(\(x, y) {
  is.na(x) <- is.na(y)
  x
}, df[c("B", "C")], df["D"])
df
#>     A  B  C  D  E
#> 1   a NA NA NA  1
#> 2   b  2  9  1  2
#> 3   c NA NA NA  3
#> 4   d  4  4  3  4
#> 5   e NA NA NA  5
#> 6   f NA  7  5  6
#> 7   g NA NA NA  7
#> 8  NA NA NA NA  8
#> 9   h  8 NA  8  9
#> 10  I NA NA NA 10

Created on 2022-03-01 by the reprex package (v2.0.1)
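
To see what is.na<- does on its own, here is a small sketch on plain vectors. The names x and mask are hypothetical; mask plays the role of column D.

```r
# `is.na<-` sets the positions selected on the right-hand side to NA.
x    <- c(10, 20, 30, 40)   # hypothetical values, like column B or C
mask <- c(NA, 1, NA, 3)     # hypothetical stand-in for column D

is.na(x) <- is.na(mask)     # positions where mask is NA become NA in x
x

# The same effect with ordinary subassignment:
y <- c(10, 20, 30, 40)
y[is.na(mask)] <- NA
```

Both forms leave the other elements untouched, so the Map call above simply applies this step to B and C in turn, recycling D for each.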

dplyr

And a solution with dplyr, but the same is.na<-.

library(dplyr)

df %>%
  mutate(across(B:C, \(x) {is.na(x) <- is.na(D); x}))
#>     A  B  C  D  E
#> 1   a NA NA NA  1
#> 2   b  2  9  1  2
#> 3   c NA NA NA  3
#> 4   d  4  4  3  4
#> 5   e NA NA NA  5
#> 6   f NA  7  5  6
#> 7   g NA NA NA  7
#> 8  NA NA NA NA  8
#> 9   h  8 NA  8  9
#> 10  I NA NA NA 10

