I have a numeric column that contains NA values and also ends with NA:
df <- data.frame(
  Diam_av = c(12.3, 13, 15.5, NA, NA, NA, NA, 13.7, NA, NA, NA, 9.98, 4, 0, 8.76, NA, NA, NA)
)
I want to interpolate the missing values. This works fine with zoo's function na.approx as long as there are non-NA boundary values to interpolate from, but it fails if, as in my case, one of the boundary values is NA (here, at the end of the column Diam_av):
library(zoo)
library(dplyr)  # provides %>% and mutate()
df %>%
  mutate(Diam_intpl = na.approx(Diam_av))
Error: Problem with `mutate()` input `Diam_intpl`.
x Input `Diam_intpl` can't be recycled to size 18.
ℹ Input `Diam_intpl` is `na.approx(Diam_av)`.
ℹ Input `Diam_intpl` must be size 18 or 1, not 15.
Any idea how to exclude/neutralize column-final NA values?
CodePudding user response:
Add na.rm = FALSE to get rid of the error: it keeps the trailing NAs in place, so the result has the same length as the column. Add rule = 2 to fill those trailing NAs with the last non-NA value.
df %>%
  mutate(Diam_intpl = na.approx(Diam_av, na.rm = FALSE),
         Diam_intpl2 = na.approx(Diam_av, na.rm = FALSE, rule = 2))
   Diam_av Diam_intpl Diam_intpl2
1    12.30      12.30       12.30
2    13.00      13.00       13.00
3    15.50      15.50       15.50
4       NA      15.14       15.14
5       NA      14.78       14.78
6       NA      14.42       14.42
7       NA      14.06       14.06
8    13.70      13.70       13.70
9       NA      12.77       12.77
10      NA      11.84       11.84
11      NA      10.91       10.91
12    9.98       9.98        9.98
13    4.00       4.00        4.00
14    0.00       0.00        0.00
15    8.76       8.76        8.76
16      NA         NA        8.76
17      NA         NA        8.76
18      NA         NA        8.76
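To see where the "size 18 or 1, not 15" error comes from, here is a minimal sketch on the bare vector (assuming zoo is installed; the variable names short, full and filled are just for illustration). By default na.approx drops leading and trailing NAs, so the result is shorter than the column and mutate() cannot recycle it:

```r
library(zoo)

x <- c(12.3, 13, 15.5, NA, NA, NA, NA, 13.7, NA, NA, NA, 9.98, 4, 0, 8.76, NA, NA, NA)

# Default (na.rm = TRUE): the three trailing NAs are dropped
short <- na.approx(x)
length(short)   # 15, shorter than the 18-row column

# na.rm = FALSE: trailing NAs are kept as NA, length is preserved
full <- na.approx(x, na.rm = FALSE)
length(full)    # 18

# rule = 2 is passed on to approx() and extends the nearest observed value
filled <- na.approx(x, na.rm = FALSE, rule = 2)
tail(filled, 3) # 8.76 8.76 8.76
```

So na.rm = FALSE alone is enough to make mutate() work; rule = 2 is only needed if the trailing NAs should be filled as well.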
CodePudding user response:
If I understand correctly, you can replace the NAs with imputeTS::na_interpolation(), which has many options:
library(imputeTS)
df$interpolated <- na_interpolation(df, option = "linear")$Diam_av
   Diam_av interpolated
1    12.30        12.30
2    13.00        13.00
3    15.50        15.50
4       NA        15.14
5       NA        14.78
6       NA        14.42
7       NA        14.06
8    13.70        13.70
9       NA        12.77
10      NA        11.84
11      NA        10.91
12    9.98         9.98
13    4.00         4.00
14    0.00         0.00
15    8.76         8.76
16      NA         8.76
17      NA         8.76
18      NA         8.76
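If you would rather not add a package, something similar should be possible with base R's approx(). This is only a sketch: interpolate_na is a hypothetical helper name, not a function from any package, and it assumes the vector has at least two non-NA values.

```r
# Hypothetical helper: linear interpolation of NAs using stats::approx().
# rule = 2 fills leading/trailing NAs with the nearest observed value.
interpolate_na <- function(v) {
  idx <- seq_along(v)
  ok <- !is.na(v)
  approx(idx[ok], v[ok], xout = idx, rule = 2)$y
}

x <- c(12.3, 13, 15.5, NA, NA, NA, NA, 13.7, NA, NA, NA, 9.98, 4, 0, 8.76, NA, NA, NA)
res <- interpolate_na(x)
res
```

On the example column this should reproduce the interpolated values shown above, including 8.76 for the last three rows.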