I have data:
Date | Price |
---|---|
"2021-01-01" | 1 |
"2021-01-02" | NA |
"2021-01-03" | NA |
"2021-01-04" | NA |
"2021-01-05" | NA |
"2021-01-06" | 6 |
"2021-01-07" | NA |
"2021-01-08" | NA |
"2021-01-09" | 3 |
And I would like to replace missing values with means, so that the end result would look like this:
Date | Price |
---|---|
"2021-01-01" | 1 |
"2021-01-02" | 2 |
"2021-01-03" | 3 |
"2021-01-04" | 4 |
"2021-01-05" | 5 |
"2021-01-06" | 6 |
"2021-01-07" | 5 |
"2021-01-08" | 4 |
"2021-01-09" | 3 |
CodePudding user response:
You can use zoo::na.approx
:
library(zoo)
na.approx(dat$Price)
# [1] 1 2 3 4 5 6 5 4 3
CodePudding user response:
I consider that you have multiple price cols, where you got the price. Then you want to create a new column named Price
which is the mean and without NA
values.
library(tidyverse)
library(dplyr)
Date <- c("2021-01-01","2021-01-02","2021-01-03","2021-01-04","2021-01-05",
"2021-01-06", "2021-01-07", "2021-01-08","2021-01-09", "2021-01-08","2021-01-09")
your.price.col1 <- c(floor(runif(9,0,100)),NA,NA)
your.price.col2 <- c(floor(runif(9,0,100)),33,44)
df <- data.frame(Date, your.price.col1,your.price.col2)
# slice your price cols, which you want to include in the mean with [2:3] for col1 and col2
df %>%
mutate(Price = rowMeans(df[2:3], na.rm=T))
Date your.price.col1 your.price.col2 Price
1 2021-01-01 96 55 75.5
2 2021-01-02 22 43 32.5
3 2021-01-03 68 62 65.0
4 2021-01-04 18 51 34.5
5 2021-01-05 27 6 16.5
6 2021-01-06 26 30 28.0
7 2021-01-07 32 22 27.0
8 2021-01-08 53 95 74.0
9 2021-01-09 74 78 76.0
10 2021-01-08 NA 33 33.0
11 2021-01-09 NA 44 44.0
CodePudding user response:
One way would be to use na_interpolation
from imputeTS
library:
imputeTS::na_interpolation(c(1, NA, NA, 4))
# 1 2 3 4
imputeTS::na_interpolation(c(6, NA, NA, 3))
# 6 5 4 3