Home > other >  How to determine the cumulative sum of one column, only if there is a value present in a another col
How to determine the cumulative sum of one column, only if there is a value present in a another col

Time:08-30

I'd like to calculate a 5-day and 3-day cumulative sum of precipitation preceding a day containing a nitrate concentration. I have a precipitation value for each day and monthly (approximately) nitrate concentrations.

Thanks for the help!

The final output should like this: [1]: https://i.stack.imgur.com/9pYxh.png

Subset of data containing 2 days with nitrate concentrations:

Date PPT NO3
7/1/1991 0 NA
7/2/1991 0 NA
7/3/1991 0 NA
7/4/1991 0 NA
7/5/1991 0 NA
7/6/1991 0 NA
7/7/1991 0 NA
7/8/1991 0 NA
7/9/1991 0 NA
7/10/1991 0 NA
7/11/1991 0 NA
7/12/1991 1.4 NA
7/13/1991 6 NA
7/14/1991 0.6 NA
7/15/1991 9 3.2
7/16/1991 0 NA
7/17/1991 2.2 NA
7/18/1991 0 NA
7/19/1991 0 NA
7/20/1991 3.4 NA
7/21/1991 3 NA
7/22/1991 0 NA
7/23/1991 0 NA
7/24/1991 3 NA
7/25/1991 0 NA
7/26/1991 0 NA
7/27/1991 0.2 NA
7/28/1991 0 NA
7/29/1991 0.4 NA
7/30/1991 0 NA
7/31/1991 0 NA
8/1/1991 0 NA
8/2/1991 0 NA
8/3/1991 0 NA
8/4/1991 0 NA
8/5/1991 0 NA
8/6/1991 0 NA
8/7/1991 1.4 NA
8/8/1991 4.6 NA
8/9/1991 27.4 NA
8/10/1991 23.8 NA
8/11/1991 4 NA
8/12/1991 5 NA
8/13/1991 0 NA
8/14/1991 0 NA
8/15/1991 0 3.09

CodePudding user response:

Does this approach solve your problem?

library(tidyverse)
library(lubridate)
#> 
#> Attaching package: 'lubridate'
#> The following objects are masked from 'package:base':
#> 
#>     date, intersect, setdiff, union

df <- structure(list(Date = c("7/1/1991", "7/2/1991", "7/3/1991", "7/4/1991", 
                              "7/5/1991", "7/6/1991", "7/7/1991", "7/8/1991", "7/9/1991", "7/10/1991", 
                              "7/11/1991", "7/12/1991", "7/13/1991", "7/14/1991", "7/15/1991", 
                              "7/16/1991", "7/17/1991", "7/18/1991", "7/19/1991", "7/20/1991", 
                              "7/21/1991", "7/22/1991", "7/23/1991", "7/24/1991", "7/25/1991", 
                              "7/26/1991", "7/27/1991", "7/28/1991", "7/29/1991", "7/30/1991", 
                              "7/31/1991", "8/1/1991", "8/2/1991", "8/3/1991", "8/4/1991", 
                              "8/5/1991", "8/6/1991", "8/7/1991", "8/8/1991", "8/9/1991", "8/10/1991", 
                              "8/11/1991", "8/12/1991", "8/13/1991", "8/14/1991", "8/15/1991"
), PPT = c(0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1.4, 6, 0.6, 9, 0, 
           2.2, 0, 0, 3.4, 3, 0, 0, 3, 0, 0, 0.2, 0, 0.4, 0, 0, 0, 0, 0, 
           0, 0, 0, 1.4, 4.6, 27.4, 23.8, 4, 5, 0, 0, 0), 
NO3 = c(NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, 3.2, NA, NA, 
        NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, 
        NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, 3.09)), 
class = "data.frame", row.names = c(NA, -46L))

df %>%
  summarise(Date = Date,
            NO3 = NO3,
            `5-day cumsum` = ifelse(!is.na(NO3), reduce(map(0:4, ~lag(PPT, ., 0)), ` `), NA),
            `3-day cumsum` = ifelse(!is.na(NO3), reduce(map(0:2, ~lag(PPT, ., 0)), ` `), NA)) %>%
  na.omit()
#>         Date  NO3 5-day cumsum 3-day cumsum
#> 15 7/15/1991 3.20           17         15.6
#> 46 8/15/1991 3.09            9          0.0

Created on 2022-08-30 by the reprex package (v2.0.1)

  • Related