Replacing NAs in a single vector of a dataframe

Time:06-18

I'm trying to replace all of the NAs in a single column of a data frame with a calculated mean value. I have different calculated means for each column, so I need to ensure I'm only replacing NAs in a single column.

Here is the code I'm using:

df$column %>% replace_na(68.9)

I keep getting an error message saying I can't use $ for an atomic vector, but I don't think I am?

CodePudding user response:

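You are, actually: that error means df itself is an atomic vector rather than a data frame at the point where the line runs, for example because it was overwritten with the result of an earlier call. Also note that tidyr::replace_na() applied to df$column returns only the filled vector and leaves df unchanged. To get the whole data frame back with one column filled, pass the data frame and a named list: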

df_with_replaced_na <- df %>% replace_na(list(column = 68.9))
# If you had more columns, it would be list(column1 = 67, column2 = 42) etc.
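For the single-column case in the question, here is a minimal sketch on a made-up data frame. The column names and values are invented for illustration, and it assumes df really is a data frame. It computes the mean rather than hard-coding it, then shows the replace_na() form and a base R alternative:

```r
library(tidyr)

# Toy data; `column` and `other` are hypothetical names.
df <- data.frame(column = c(60, NA, 80), other = c(1, NA, 3))

# Mean of the observed (non-NA) values in this one column.
col_mean <- mean(df$column, na.rm = TRUE)

# Only `column` is named in the list, so the NA in `other` is left alone.
df_filled <- replace_na(df, list(column = col_mean))

# Base R alternative: assign into the NA positions of that column in place.
df$column[is.na(df$column)] <- col_mean
```

Either way, only the named column changes, so each column can get its own mean.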

I understand that you want to impute your missing values with the mean of each column.

Here is a general solution:

library(tidyverse)
impute_na_with_mean <- function(x) dplyr::coalesce(x, mean(x, na.rm = TRUE))

impute_na_with_mean(c(1, 2, NA))
#> [1] 1.0 2.0 1.5

df %>% 
 mutate(across(where(anyNA) & where(is.numeric), .fns = impute_na_with_mean))
# This fills the NAs in every numeric column that has any with that column's own mean.
# Assign the result (e.g. df <- df %>% ...) to keep it.
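To see which columns the across() call touches, here is a small sketch on toy data. The column names a, b and label are invented for the example:

```r
library(dplyr)

impute_na_with_mean <- function(x) coalesce(x, mean(x, na.rm = TRUE))

# Toy data with hypothetical column names.
df <- data.frame(
  a = c(1, NA, 3),         # numeric with an NA: gets imputed
  b = c(10, 20, 30),       # numeric without NAs: not selected
  label = c("x", NA, "z")  # character: skipped by is.numeric
)

df_imputed <- df %>%
  mutate(across(where(anyNA) & where(is.numeric), impute_na_with_mean))
```

Here only a is changed. The NA in label stays, because the selection requires a numeric column.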