I'm trying to replace all of the NAs in a single column of a data frame with a calculated mean value. I have different calculated means for each column, so I need to ensure I'm only replacing NAs in a single column.
Here is the code I'm using:
df$column %>% replace_na(68.9)
I keep getting an error message saying I can't use $ for an atomic vector, but I don't think I am?
CodePudding user response:
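You are, actually: that error means df is an atomic vector (or a matrix) rather than a data frame at the point where df$column runs, so check class(df) first. tidyr::replace_na() itself accepts either a vector or a data frame, but the result has to be assigned, otherwise it is discarded. With a data frame, pass a named list: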
df_with_replaced_na <- df %>% replace_na(list(column = 68.9))
# If you had more columns, it would be list(column1 = 67, column2 = 42) etc.
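If you would rather keep the df$column form from the question, here is a minimal sketch of the vector version, assuming df really is a data frame. The example data and the names column, other, and column_mean are made up for illustration; the mean is computed from the column instead of being typed in by hand.

```r
library(tidyr)

# Hypothetical example data; `column` and `other` are placeholder names.
df <- data.frame(column = c(60, NA, 70), other = c(1, NA, 3))

# `$` only works on list-like objects, so confirm `df` is a data frame first.
stopifnot(is.data.frame(df))

# Mean of this one column, ignoring the missing values.
column_mean <- mean(df$column, na.rm = TRUE)

# Vector form of replace_na(): fills NAs in this column only.
# The result must be assigned back, otherwise `df` is left unchanged.
df$column <- replace_na(df$column, column_mean)

df
```

Only column is touched here; the NA in other stays as it is.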
I understand that you want to impute your missing values with the mean of each column.
Here is a general solution:
library(tidyverse)
impute_na_with_mean <- function(x) dplyr::coalesce(x, mean(x, na.rm = TRUE))
impute_na_with_mean(c(1, 2, NA))
#> [1] 1.0 2.0 1.5
df %>%
  mutate(across(where(anyNA) & where(is.numeric), .fns = impute_na_with_mean))
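# This replaces the NAs in every numeric column that has at least one NA with that column's own mean.
# Non-numeric columns and columns without NAs are left untouched. Assign the result to keep it.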
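To see what the across() call does, here is a small self-contained sketch on made-up data. The tibble toy and its columns a, b, and label are hypothetical and only there to show which columns get imputed.

```r
library(dplyr)

# Same helper as above: fill NAs with the mean of the non-missing values.
impute_na_with_mean <- function(x) dplyr::coalesce(x, mean(x, na.rm = TRUE))

# Hypothetical data: two numeric columns with NAs and one character column.
toy <- tibble(
  a     = c(1, 2, NA),
  b     = c(NA, 10, 20),
  label = c("x", NA, "z")
)

# Impute only the numeric columns that contain at least one NA.
imputed <- toy %>%
  mutate(across(where(anyNA) & where(is.numeric), .fns = impute_na_with_mean))

imputed
```

Each numeric column is filled with its own mean (1.5 for a, 15 for b), while label keeps its NA because it is not numeric.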