How to replace all kinds of NA (true NA and character strings) in R

Time:11-16

I have a large data frame that contains NA values, both true NA in numeric columns and the string 'NA' in character columns. I want to replace all kinds of NA with -. Here is a simple example. The df looks like this:

library(tibble)
df = tibble(a = c(1,2, NA),
                b = c('a','b', 'NA'),
                c = c('1.5(1,7, 1.9)', '1.3 (1.4, 1.5)', 'NA (NA, NA)'))

> df
# A tibble: 3 x 3
      a b     c             
  <dbl> <chr> <chr>         
1     1 a     1.5(1,7, 1.9) 
2     2 b     1.3 (1.4, 1.5)
3    NA NA    NA (NA, NA)  

The df I expect should look like this:

df_expected = tibble(a = c(1,2, '-'),
                     b = c('a','b', '-'),
                     c = c('1.5(1,7, 1.9)', '1.3 (1.4, 1.5)', '-'))
> df_expected
# A tibble: 3 x 3
  a     b     c             
  <chr> <chr> <chr>         
1 1     a     1.5(1,7, 1.9) 
2 2     b     1.3 (1.4, 1.5)
3 -     -     -            

Any help will be highly appreciated!

CodePudding user response:

You can do:

library(tidyverse)    
df %>%
  mutate(across(everything(), as.character),
         across(everything(), ~if_else(is.na(.) | str_detect(., "NA"), "-", .)))

# A tibble: 3 x 3
  a     b     c             
  <chr> <chr> <chr>         
1 1     a     1.5(1,7, 1.9) 
2 2     b     1.3 (1.4, 1.5)
3 -     -     -             
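The as.character step comes first because if_else() is strict about types: the replacement "-" is a character, so it cannot be combined with a double column such as a. A minimal sketch of that behaviour, assuming dplyr is installed (the names x, strict and converted are only for illustration):

```r
library(dplyr)

x <- c(1, 2, NA)  # a double vector, like column `a`

# if_else() refuses to mix a character replacement with a double vector,
# so the error is caught here and turned into a marker string
strict <- tryCatch(
  if_else(is.na(x), "-", x),
  error = function(e) "type error"
)

# converting to character first gives both branches the same type
converted <- if_else(is.na(x), "-", as.character(x))
```

Base ifelse() is more lenient and coerces the result, which is why the next answer can skip the conversion.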

CodePudding user response:

Another option is to wrap ifelse and grepl in a custom function and then call it through mutate_all.

na_replace <- function(x) {
  
  ifelse(is.na(x) | grepl('NA', x), '-', x)
  
}

df %>% 
  mutate_all(na_replace)

Since mutate_all is superseded and may not receive further updates, you can use mutate together with across instead:

df %>% 
  mutate(across(everything(), na_replace))

Both options give:

## A tibble: 3 × 3
#  a     b     c             
#  <chr> <chr> <chr>         
#1 1     a     1.5(1,7, 1.9) 
#2 2     b     1.3 (1.4, 1.5)
#3 -     -     -             
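If dplyr is not available, the same helper should also work with plain lapply. This is only a sketch under the assumption that the data is held in a base data.frame rather than a tibble:

```r
# same example data as in the question, built as a base data.frame
df <- data.frame(
  a = c(1, 2, NA),
  b = c("a", "b", "NA"),
  c = c("1.5(1,7, 1.9)", "1.3 (1.4, 1.5)", "NA (NA, NA)"),
  stringsAsFactors = FALSE
)

# replace true NA and any value containing the text "NA" with "-"
na_replace <- function(x) {
  ifelse(is.na(x) | grepl("NA", x), "-", x)
}

# df[] keeps the data.frame structure while every column is replaced
df[] <- lapply(df, na_replace)
```

Because ifelse() coerces, column a ends up as character, just like in the tidyverse output.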

Edit

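If a value such as BANANA should be left untouched (in the output below it stands in for the 'NA' in column b), we can anchor the pattern with a word boundary, '\\b', inside grepl: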

na_replace <- function(x) {
  
  ifelse(is.na(x) | grepl('\\bNA', x, perl = TRUE), '-', x)
}

df %>% 
  mutate(across(everything(), na_replace))
## A tibble: 3 × 3
#  a     b      c             
#  <chr> <chr>  <chr>         
#1 1     a      1.5(1,7, 1.9) 
#2 2     b      1.3 (1.4, 1.5)
#3 -     BANANA -             
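Note that '\\bNA' only anchors the start of the match, so a value like NAME would still be replaced. A possible refinement, shown here as a sketch with made-up test values, is to add a trailing '\\b' as well:

```r
# hypothetical values to compare the two patterns
x <- c("BANANA", "NAME", "NA", "NA (NA, NA)", NA)

# boundary only on the left: "NAME" still matches
left_only <- grepl("\\bNA", x, perl = TRUE)

# boundary on both sides: only a standalone NA matches
both_sides <- grepl("\\bNA\\b", x, perl = TRUE)

na_replace_strict <- function(x) {
  ifelse(is.na(x) | grepl("\\bNA\\b", x, perl = TRUE), "-", x)
}

cleaned <- na_replace_strict(x)
```

Whether NAME should count as NA depends on your data, so pick the pattern that fits.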