Home > Software engineering >  Replace NA in multiple columns conditionally and with the value of another column
Replace NA in multiple columns conditionally and with the value of another column

Time:01-25

I am a seldom user of R and have some very basic coding skills. I believe I am completely misunderstanding how to solve this problem. Assuming following data

data <- tibble(a = letters[1:4], b1 = 1:4, b2 = 4:1, 
               b3 = c(NA, 2, NA,4), b4 = c(NA, 1, 3, 6))
data <- data %>%
  mutate(average = rowMeans(select(., starts_with("b")), na.rm = TRUE),
         NA_count = rowSums(is.na(select(., starts_with("b"))))
  )

data

# A tibble: 4 × 7
  a        b1    b2    b3    b4 average NA_count
  <chr> <int> <int> <dbl> <dbl>   <dbl>    <dbl>
1 a         1     4    NA    NA    2.5         2
2 b         2     3     2     1    2           0
3 c         3     2    NA     3    2.67        1
4 d         4     1     4     6    3.75        0
          

I am trying to replace NA with the value in average only if there is one NA on the line (I defined NA_count as helper column). I've tried the below code, but I think I am misunderstanding how to use it, as I get multiple errors. Also, I am trying to find a solution using dplyr and/or the tidyverse package.

data %>%
  mutate(across(starts_with("b"), ~ if_else(NA_count == 1, average, .)))

Thanks for your help !

CodePudding user response:

It is due to the column type difference as b1, b2 are integer type and average is numeric. We may need to use as.numeric on if_else as if_else checks for the type for each of the yes, no output and it needs to be the same (same for case_when, but it can work for ifelse as there is type coersion). In addition, the test condition needs to also check for the NA location with is.na

library(dplyr)
data %>% 
   mutate(across(starts_with("b"), ~ if_else(NA_count ==1 & is.na(.x), 
     average, as.numeric(.x))))

-output

# A tibble: 4 × 7
  a        b1    b2    b3    b4 average NA_count
  <chr> <dbl> <dbl> <dbl> <dbl>   <dbl>    <dbl>
1 a         1     4 NA       NA    2.5         2
2 b         2     3  2        1    2           0
3 c         3     2  2.67     3    2.67        1
4 d         4     1  4        6    3.75        0
  •  Tags:  
  • r
  • Related