Subtract a value based on the first instance of a group in another column

Time:10-06

If anyone has a moment to help: I would like to do the following with the data frame below.

time      look       category
150       left       B1
170       right      B1
100       left       B1
100       away       A1
70        left       A1
400       right      A1
100       left       A1
300       right      A2
100       left       A2
100       right      A2
100       left       B1
150       right      B1
200       away       B1
100       left       B1
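
To make the example reproducible, the table above can be entered as an R data frame. This is only a sketch: the name `df` is an assumption, chosen because the answer's code refers to a data frame by that name, and the columns are assumed to be numeric `time` and character `look` and `category`.

```r
# Rebuild the example data from the table above.
# The name `df` is an assumption; the column types are assumed too.
df <- data.frame(
  time = c(150, 170, 100, 100, 70, 400, 100, 300, 100, 100, 100, 150, 200, 100),
  look = c("left", "right", "left", "away", "left", "right", "left",
           "right", "left", "right", "left", "right", "away", "left"),
  category = c("B1", "B1", "B1", "A1", "A1", "A1", "A1",
               "A2", "A2", "A2", "B1", "B1", "B1", "B1"),
  stringsAsFactors = FALSE  # keep category as character so startsWith() works
)
```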

I would like to produce a new data frame that:

  • Subtracts a fixed, arbitrary value (for example 200) from the time column.
  • The subtraction happens only once per group, starting at the first row of that group under category.
  • It applies only to groups whose name begins with A.
  • For example, take A1. Subtracting 200 removes the first two rows of A1 (100 + 70 = 170) and takes the remaining 30 off the 400, leaving 370. Notice the change in the data frame below.
  • For A2, subtracting 200 from its first row turns the 300 into 100. No rows are removed because the first time, 300, is larger than 200.
  • The key is that the row order remains the same.
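
The A1 case in the list above can be checked by hand with a running total. The snippet below is a small illustration in base R, not part of the original question; the variable names (`a1`, `amount`, `remaining`, `kept`, `new_time`) are made up for the example.

```r
# Times for the A1 rows, in their original order
a1 <- c(100, 70, 400, 100)
amount <- 200  # the arbitrary value to remove

# Running total minus the amount: -100  -30  370  470
remaining <- cumsum(a1) - amount

# Rows whose running total is still <= 0 are fully used up and drop out
kept <- remaining[remaining > 0]   # 370 470

# Convert the adjusted running totals back into per-row times
new_time <- diff(c(0, kept))       # 370 100
print(new_time)
```

The two surviving values, 370 and 100, match the A1 rows in the desired output.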

It should look like this:

time      look       category
150       left       B1
170       right      B1
100       left       B1
370       right      A1
100       left       A1
100       right      A2
100       left       A2
100       right      A2
100       left       B1
150       right      B1
200       away       B1
100       left       B1

I have no clue as to how to begin so any insight would be amazing.

Edit #1: We only want to subtract this arbitrary value from groups that begin with A, so groups beginning with B remain unchanged.

CodePudding user response:

You may try the following, which treats each consecutive run of the same category as its own group:

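library(dplyr)
# data.table is only needed for rleid(), so it is called with :: instead of being attached

df %>%
  # Number each consecutive run of the same category, so the second B1 block is its own group
  group_by(run = data.table::rleid(category)) %>%
  # Running total of time within each run
  mutate(ctime = cumsum(time)) %>%
  # Take 200 off the running total, but only for categories starting with "A"
  mutate(val1 = ifelse(startsWith(category, "A"), ctime - 200, ctime)) %>%
  # Drop the rows that the 200 has fully used up
  filter(val1 > 0) %>%
  # Turn the adjusted running totals back into per-row times
  mutate(time = val1 - lag(val1, default = 0)) %>%
  ungroup() %>%
  select(time, look, category)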

    time look  category
   <dbl> <chr> <chr>   
 1   150 left  B1      
 2   170 right B1      
 3   100 left  B1      
 4   370 right A1      
 5   100 left  A1      
 6   100 right A2      
 7   100 left  A2      
 8   100 right A2      
 9   100 left  B1      
10   150 right B1      
11   200 away  B1      
12   100 left  B1
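
If adding dplyr and data.table is not an option, the same idea can be sketched in base R. This is an alternative written for illustration, not the answerer's code: `subtract_once` and its arguments `amount` and `prefix` are hypothetical names, and like the pipeline above it treats each consecutive run of a category as a separate group.

```r
# Hypothetical helper (not from the original answer), base R only.
subtract_once <- function(df, amount = 200, prefix = "A") {
  # Number consecutive runs of identical category values: 1, 1, 2, 2, 2, ...
  runs <- rle(as.character(df$category))
  run_id <- rep(seq_along(runs$lengths), runs$lengths)

  pieces <- lapply(split(df, run_id), function(g) {
    # Runs whose category does not start with the prefix are returned unchanged
    if (!startsWith(as.character(g$category[1]), prefix)) return(g)
    remaining <- cumsum(g$time) - amount    # running total minus the amount
    keep <- remaining > 0                   # rows not fully used up
    g <- g[keep, , drop = FALSE]
    g$time <- diff(c(0, remaining[keep]))   # back to per-row times
    g
  })

  out <- do.call(rbind, pieces)  # runs are recombined in their original order
  rownames(out) <- NULL
  out
}

# Small made-up data frame to try it on
demo <- data.frame(
  time = c(150, 100, 70, 400, 300, 100),
  look = c("left", "away", "left", "right", "right", "left"),
  category = c("B1", "A1", "A1", "A1", "A2", "A2"),
  stringsAsFactors = FALSE
)
res <- subtract_once(demo)
print(res)  # time column: 150 370 100 100
```

Note that a row whose running total lands exactly on the amount ends up with a time of 0 and is dropped, which mirrors the `val1 > 0` filter in the dplyr version.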

Tags: r