Fill the NA values of all numeric columns in a data frame with the column median in R

Time:10-18

I need to fill the NA values of every numerical column in a data frame with that column's median. I wrote the following code.

median_forNumericalNulls <- function(dataframe){
  
  nums <- unlist(lapply(dataframe, is.numeric))  
  
  df_num <- dataframe[ , nums]
  
  df_num[] <- lapply(df_num, function(x) { 
    x[is.na(x)] <- median(x, na.rm = TRUE)
    x
  })      

  return(dataframe)
  
}

median_forNumericalNulls(A)

A is the parent table, which contains both numerical and categorical variables. How can I replace the columns of the data frame A with the output of the function median_forNumericalNulls?

Is there a better way that we can do the same?

CodePudding user response:

The original function fills the NAs in the copy df_num but then returns the unchanged dataframe. Maybe change the function to subset and update the columns of dataframe directly, instead of creating another object and updating that:

median_forNumericalNulls <- function(dataframe){
  
  # logical vector: TRUE for numeric columns
  nums <- unlist(lapply(dataframe, is.numeric))  
  
  # update the numeric columns in place; other columns are left untouched
  dataframe[nums] <- lapply(dataframe[nums], function(x) { 
    x[is.na(x)] <- median(x, na.rm = TRUE)
    x
  })      
  dataframe
  
}

Testing (assign the result back to A):

A <- median_forNumericalNulls(A)
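
To see the effect on something small, here is a minimal sketch on a made-up data frame (the columns x, y and g are hypothetical and not from the question). It repeats the function so that it runs on its own, using vapply instead of unlist(lapply(...)), which should give the same logical vector.

```r
# Hypothetical toy data: two numeric columns with NAs and one categorical column
A <- data.frame(
  x = c(1, NA, 3, 10),
  y = c(NA, 2, 4, 6),
  g = c("a", NA, "b", "a"),
  stringsAsFactors = FALSE
)

median_forNumericalNulls <- function(dataframe) {
  # TRUE for numeric columns, FALSE otherwise
  nums <- vapply(dataframe, is.numeric, logical(1))
  # fill NAs in the numeric columns only and write them back
  dataframe[nums] <- lapply(dataframe[nums], function(x) {
    x[is.na(x)] <- median(x, na.rm = TRUE)
    x
  })
  dataframe
}

A_filled <- median_forNumericalNulls(A)
A_filled$x   # NA replaced by median(c(1, 3, 10)), i.e. 3
A_filled$y   # NA replaced by median(c(2, 4, 6)), i.e. 4
A_filled$g   # categorical column is untouched, its NA is kept
```

The function returns a modified copy, so the result only replaces A if it is assigned back, as in the testing line above.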

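This can also be done compactly with na.aggregate from zoo. Because A also contains categorical columns, it is probably safest to apply it to the numeric columns only:

library(zoo)
nums <- sapply(A, is.numeric)
A[nums] <- na.aggregate(A[nums], FUN = median)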

Or using dplyr from the tidyverse:

library(dplyr)
A <- A %>%
   mutate(across(where(is.numeric), 
         ~ replace(., is.na(.), median(., na.rm = TRUE))))
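
As a quick check of the dplyr version, here is a small sketch on made-up data (the tibble and its columns x, y and g are hypothetical). It assumes dplyr 1.0 or later, where across() and where() are available, and writes .x instead of . inside the lambda, which should be equivalent here.

```r
library(dplyr)

# Hypothetical toy data, not from the original question
A <- tibble(
  x = c(1, NA, 3, 10),
  y = c(NA, 2, 4, 6),
  g = c("a", NA, "b", "a")
)

A_filled <- A %>%
  mutate(across(where(is.numeric),
                ~ replace(.x, is.na(.x), median(.x, na.rm = TRUE))))

A_filled$x   # the NA becomes 3, the median of 1, 3, 10
A_filled$g   # the character column is skipped, so its NA stays
```

Only the columns selected by where(is.numeric) are touched, so categorical columns pass through unchanged.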

CodePudding user response:

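Here is another approach, shown on the first 10 rows of iris. Values less than or equal to 3 are first set to NA, and the NAs are then replaced with the column median:

library(dplyr)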
iris1 <- iris %>% 
  select(1, 2, 5)
head(iris1, 10) %>% 
  as_tibble() %>% 
  mutate(across(where(is.numeric), ~ifelse(.<= 3, NA, .))) %>% 
  mutate(across(where(is.numeric), ~ifelse(is.na(.), median(.,na.rm = TRUE), .)))

Output:

   Sepal.Length Sepal.Width Species
          <dbl>       <dbl> <fct>  
 1          5.1         3.5 setosa 
 2          4.9         3.4 setosa 
 3          4.7         3.2 setosa 
 4          4.6         3.1 setosa 
 5          5           3.6 setosa 
 6          5.4         3.9 setosa 
 7          4.6         3.4 setosa 
 8          5           3.4 setosa 
 9          4.4         3.4 setosa 
10          4.9         3.1 setosa 