Home > Software design >  Replace missing values in R dataframe
Replace missing values in R dataframe

Time:03-02

I have data:

Date Price
"2021-01-01" 1
"2021-01-02" NA
"2021-01-03" NA
"2021-01-04" NA
"2021-01-05" NA
"2021-01-06" 6
"2021-01-07" NA
"2021-01-08" NA
"2021-01-09" 3

And I would like to replace missing values with means, so that the end result would look like this:

Date Price
"2021-01-01" 1
"2021-01-02" 2
"2021-01-03" 3
"2021-01-04" 4
"2021-01-05" 5
"2021-01-06" 6
"2021-01-07" 5
"2021-01-08" 4
"2021-01-09" 3

CodePudding user response:

You can use zoo::na.approx:

library(zoo)
na.approx(dat$Price)
# [1] 1 2 3 4 5 6 5 4 3

CodePudding user response:

I consider that you have multiple price cols, where you got the price. Then you want to create a new column named Price which is the mean and without NA values.

library(tidyverse)
library(dplyr)

Date <- c("2021-01-01","2021-01-02","2021-01-03","2021-01-04","2021-01-05",
          "2021-01-06", "2021-01-07", "2021-01-08","2021-01-09", "2021-01-08","2021-01-09")

your.price.col1 <- c(floor(runif(9,0,100)),NA,NA)
your.price.col2 <- c(floor(runif(9,0,100)),33,44)

df <- data.frame(Date, your.price.col1,your.price.col2)

# slice your price cols, which you want to include in the mean with [2:3] for col1 and col2
df %>%
  mutate(Price = rowMeans(df[2:3], na.rm=T))

         Date your.price.col1 your.price.col2 Price
1  2021-01-01              96              55  75.5
2  2021-01-02              22              43  32.5
3  2021-01-03              68              62  65.0
4  2021-01-04              18              51  34.5
5  2021-01-05              27               6  16.5
6  2021-01-06              26              30  28.0
7  2021-01-07              32              22  27.0
8  2021-01-08              53              95  74.0
9  2021-01-09              74              78  76.0
10 2021-01-08              NA              33  33.0
11 2021-01-09              NA              44  44.0

CodePudding user response:

One way would be to use na_interpolation from imputeTS library:

imputeTS::na_interpolation(c(1, NA, NA, 4))
# 1 2 3 4
imputeTS::na_interpolation(c(6, NA, NA, 3))
# 6 5 4 3
  • Related