Home > Enterprise >  Subtract above number that is not NA in r
Subtract above number that is not NA in r

Time:10-15

I have a table similar to this minimal example without the difference column:

trigger values difference
0 3
0 NA
0 NA
1 5 2
0 4
0 NA
1 10 6

I want to subtract the above number (and leave out the NAs) from the number at each trigger point (trigger = 1)

Is there a way to do this in R?

CodePudding user response:

Create a grouping column with cumsum on the 'trigger' and taking the lag, then do the difference between the first and last element and replace it as the last value per group

library(dplyr)
df1 %>%
    group_by(grp = lag(cumsum(trigger), default = 0)) %>% 
    mutate(difference = replace(rep(NA, n()), n(),
         values[n()] - values[1])) %>%
    ungroup %>%
    select(-grp)

-output

# A tibble: 7 × 3
  trigger values difference
    <int>  <int>      <int>
1       0      3         NA
2       0     NA         NA
3       0     NA         NA
4       1      5          2
5       0      4         NA
6       0     NA         NA
7       1     10          6

data

df1 <- structure(list(trigger = c(0L, 0L, 0L, 1L, 0L, 0L, 1L), values = c(3L, 
NA, NA, 5L, 4L, NA, 10L)), class = "data.frame", row.names = c(NA, 
-7L))

CodePudding user response:

# Import data: df => data.frame
df <- structure(list(trigger = c(0L, 0L, 0L, 1L, 0L, 0L, 1L), values = c(3L, 
NA, NA, 5L, 4L, NA, 10L), diff_col = c(NA, NA, NA, 2L, -1L, NA, 
6L)), row.names = c(NA, -7L), class = "data.frame")

# Create an empty vector: diff_col => integer vector
df$diff_col <- NA_integer_

# Difference the X.values vector, ignoring NAs: 
# diff_col => integer vector
df[which(!(is.na(df$values)))[-1], "diff_col"] <- diff(
   na.omit(
      df$values
   )
)

# Nullify the value if the trigger is 0:
# diff_col => integer vector
df$diff_col <- with(
   df, 
   ifelse(
      trigger == 0,
      NA_integer_,
      diff_col
   )
)
  • Related