Replace a character "?" with NA in R

Time:06-27

I exported a large database, and some fields were exported as � because they contained no value. My goal is to calculate the average of each row, but I can't. I also tried to replace � with NA using

df[ df == "?" ] <- NA 

but it didn't work. How can I calculate the average per row while that character is present, or else how can I replace it with NA?

Thank you.

CodePudding user response:

Try matching the Unicode replacement character (U+FFFD) rather than a literal "?":

df <- c("1", "3", "4" , "�" , 5 , "�")

df[ df == "\UFFFD" ] <- NA

Output

df

#> [1] "1" "3" "4" NA  "5" NA 
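The same comparison can be applied to a whole data frame rather than a single vector. Here is a minimal sketch in base R; the data frame and its column names are made up for illustration, and it assumes the file was read as UTF-8 so that the cells really contain U+FFFD:

```r
# Hypothetical example data; the column names are assumptions, not the asker's
df <- data.frame(
  id       = 1:3,
  values   = c("78", "\uFFFD", "64"),
  values_2 = c("50", "\uFFFD", "\uFFFD"),
  stringsAsFactors = FALSE
)

# df == "\uFFFD" yields a logical matrix; indexing with it
# replaces every matching cell in every column with NA
df[df == "\uFFFD"] <- NA

is.na(df$values)
```

Note that the columns stay character after this step, so they still need as.numeric() before any mean can be computed.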

CodePudding user response:

As suggested by Allen Cameron, you can use as.numeric, which turns any value that cannot be parsed as a number into NA (with a coercion warning). I will simply show how to apply it across columns, since you said it was a large database.

Example data

# A tibble: 5 × 3
     id values values_2
  <int> <chr>  <chr>   
1     1 78     50      
2     2 �      �       
3     3 64     �       
4     4 23     20      
5     5 F      Random  

library(dplyr)
# Convert columns 2 and 3 (values, values_2) to numeric
df %>% 
  mutate(across(2:3, ~ as.numeric(.x)))

# A tibble: 5 × 3
     id values values_2
  <int>  <dbl>    <dbl>
1     1     78       50
2     2     NA       NA
3     3     64       NA
4     4     23       20
5     5     NA       NA
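Because as.numeric turns every unparseable string into NA, values such as "F" or "Random" disappear along with �. If that matters, one possible check is to list which distinct strings would be lost before converting. This is only a sketch; the vector x and the name lost are illustrative:

```r
# Illustrative vector mirroring the "values" column above
x <- c("78", "\uFFFD", "64", "23", "F")

# Suppress the expected "NAs introduced by coercion" warning
converted <- suppressWarnings(as.numeric(x))

# Strings that were not NA before conversion but are NA afterwards
lost <- unique(x[is.na(converted) & !is.na(x)])
lost
```

If lost contains anything other than �, you may want to handle those values separately instead of silently dropping them.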

Row-wise mean() calculation, excluding the irrelevant id column. With na.rm = TRUE, rows where every value is NA return NaN:

df %>% 
  mutate(across(2:3, ~ as.numeric(.x))) %>% 
  rowwise() %>% 
  mutate(mean = mean(c_across(2:3), na.rm = TRUE))

# A tibble: 5 × 4
# Rowwise: 
     id values values_2  mean
  <int>  <dbl>    <dbl> <dbl>
1     1     78       50  64  
2     2     NA       NA NaN  
3     3     64       NA  64  
4     4     23       20  21.5
5     5     NA       NA NaN  
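For a large table, a base R version using rowMeans() is worth trying as well, since it is vectorised and will likely be faster than rowwise(). A minimal sketch, rebuilding the example data as a plain data frame (the construction below is an assumption about how the data looks):

```r
# Example data rebuilt as a base data frame
df <- data.frame(
  id       = 1:5,
  values   = c("78", "\uFFFD", "64", "23", "F"),
  values_2 = c("50", "\uFFFD", "\uFFFD", "20", "Random"),
  stringsAsFactors = FALSE
)

cols <- c("values", "values_2")

# Convert the selected columns; unparseable strings become NA
df[cols] <- lapply(df[cols], function(x) suppressWarnings(as.numeric(x)))

# Row means over the selected columns only, ignoring NA
df$mean <- rowMeans(df[cols], na.rm = TRUE)
df
```

As in the dplyr version, rows with no usable values come back as NaN; if you would rather have NA there, df$mean[is.nan(df$mean)] <- NA will convert them.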