Home > Software design >  When value is negative, subtract result from previous value
When value is negative, subtract result from previous value

Time:12-13

I have a dataset that logically shouldn't have any negative values, but because of revisions to the data, it sometimes does. To adjust for this, I want to distribute negative values to the previous day's data.

For example, I'd like the values to change like so:

Old New
18 18
12 10
1 0
0 0
0 0
-3 0
1 1
0 0
0 0

I'm trying to do this programmatically in R, but haven't been able to make it work.

Here's my latest attempt, which kinda works, but only partially and feels unstable:

library(tidyverse)
library(lubridate)

# Data
nums <- as_tibble(list(
  value = c(18, 12, 1, 0, 0, -3, 1, 0, 0 ),
  date = c("2022-01-01","2022-01-02","2022-01-03",
           "2022-01-04","2022-01-05","2022-01-06",
           "2022-01-07","2022-01-08","2022-01-09"))) |> 
  mutate(date = ymd(date)) %>%
  select(date,value)

# Remove negatives
no_negs <- nums |> 
  arrange(desc(date)) |> 
  mutate(neww = if_else(lag(value,1) < 0, value   lag(value,1),value)) |>
  arrange(date) |> 
  mutate(neww = ifelse(is.na(neww), 0, neww))

Any thoughts on whether this is possible, and if so, how?

Thanks!

Edit: new example data

CodePudding user response:

Edits: reduction of code, fix for consecutive zeroes, use second data set, general improvement.


We can Reduce it:

nums1 %>%
  mutate(
    add = Reduce(function(prev, this) min(this prev, 0),
                 value, init = 0, accumulate = TRUE, right = TRUE)[-1], 
    value2 = pmax(value   add, 0)
  ) %>%
  select(-add)
# # A tibble: 6 x 3
#   date       value value2
#   <date>     <dbl>  <dbl>
# 1 2022-01-01    18     18
# 2 2022-01-02    12      5
# 3 2022-01-03    -3      0
# 4 2022-01-04    -4      0
# 5 2022-01-05     5      3
# 6 2022-01-06    -2      0

and the second dataset with more consecutive zeroes:

nums2 %>%
  mutate(
    add = Reduce(function(prev, this) min(this prev, 0),
                 value, init = 0, accumulate = TRUE, right = TRUE)[-1], 
    value2 = pmax(value   add, 0)
  ) %>%
  select(-add)
# # A tibble: 9 x 3
#   date       value value2
#   <date>     <dbl>  <dbl>
# 1 2022-01-01    18     18
# 2 2022-01-02    12     10
# 3 2022-01-03     1      0
# 4 2022-01-04     0      0
# 5 2022-01-05     0      0
# 6 2022-01-06    -3      0
# 7 2022-01-07     1      1
# 8 2022-01-08     0      0
# 9 2022-01-09     0      0

Data

nums1 <- structure(list(date = structure(c(18993, 18994, 18995, 18996, 18997, 18998), class = "Date"), value = c(18, 12, -3, -4, 5, -2)), row.names = c(NA, -6L), class = c("tbl_df", "tbl", "data.frame"))
nums2 <- structure(list(date = structure(c(18993, 18994, 18995, 18996, 18997, 18998, 18999, 19000, 19001), class = "Date"), value = c(18, 12, 1, 0, 0, -3, 1, 0, 0)), row.names = c(NA, -9L), class = c("tbl_df", "tbl", "data.frame"))
  • Related