I have two columns of data and I want to subtract one from the previous row entry of the other. The lag()
function in dplyr is perfect for this, but unfortunately I have NAs in the second row of data. This means that when calculating the new value for rows when the previous row has an NA in the second column is resulting in NA's (see below table):
Value 1 | Value 2 | New Value |
---|---|---|
NA | 2 | NA |
13 | 3 | 11 |
6 | NA | 3 |
6 | 4 | NA |
5 | 4 | 1 |
Is there a way to omit then NAs and have it use the last actual value in the second column instead? Like an na.rm=TRUE
that can be added? The result would look like:
Value 1 | Value 2 | New Value |
---|---|---|
NA | 2 | NA |
13 | 3 | 11 |
6 | NA | 3 |
6 | 4 | 3 |
5 | 4 | 1 |
CodePudding user response:
One option would be to mutate Value 2 beforehand and fill the NA with the last non-NA value:
df %>%
tidyr::fill(`Value 2`, .direction = "down")
CodePudding user response:
After making that New.Value, you can fill the NA
with previous value using fill()
from tidyr package.
library(dplyr)
library(tidyr)
df <- tibble::tribble(
~Value.1, ~Value.2, ~New.Value,
NA, 2, NA,
13,3,11,
6,NA,3,
6,4,NA,
5,4,1,
)
df
#> # A tibble: 5 × 3
#> Value.1 Value.2 New.Value
#> <dbl> <dbl> <dbl>
#> 1 NA 2 NA
#> 2 13 3 11
#> 3 6 NA 3
#> 4 6 4 NA
#> 5 5 4 1
df %>%
fill(New.Value, .direction = "down")
#> # A tibble: 5 × 3
#> Value.1 Value.2 New.Value
#> <dbl> <dbl> <dbl>
#> 1 NA 2 NA
#> 2 13 3 11
#> 3 6 NA 3
#> 4 6 4 3
#> 5 5 4 1
Created on 2022-07-08 by the reprex package (v2.0.1)