If anyone has a moment to help: here is what I would love to do with the data frame below.
time look category
150 left B1
170 right B1
100 left B1
100 away A1
70 left A1
400 right A1
100 left A1
300 right A2
100 left A2
100 right A2
100 left B1
150 right B1
200 away B1
100 left B1
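For reproducibility, the table above can be built as follows. This is only a sketch: the object name `df` is an assumption (it matches the name used in the answer further down), and `time` is assumed to be numeric.

```r
# Sample data from the table above; the name `df` is an assumption.
df <- data.frame(
  time = c(150, 170, 100, 100, 70, 400, 100,
           300, 100, 100, 100, 150, 200, 100),
  look = c("left", "right", "left", "away", "left", "right", "left",
           "right", "left", "right", "left", "right", "away", "left"),
  category = c("B1", "B1", "B1", "A1", "A1", "A1", "A1",
               "A2", "A2", "A2", "B1", "B1", "B1", "B1"),
  stringsAsFactors = FALSE  # keep look and category as character vectors
)
```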
I would like to produce a new data frame that:
- Subtracts a fixed, arbitrary value (for example 200) from the time column
- This subtraction occurs only once per group, starting at the first row of that group under category
- This only occurs for groups beginning with A
- For example, looking at A1: subtracting 200 means the first two rows of A1 (100 + 70 = 170) are removed from the data frame, and the remaining 30 is subtracted from 400, leaving 370. Notice the change in the data frame below.
- A2: subtract 200 from the first row of A2, so the 300 becomes 100. No rows are removed because the first time value (300) is greater than 200.
- The key is that the order remains the same.
It should look like this:
time look category
150 left B1
170 right B1
100 left B1
370 right A1
100 left A1
100 right A2
100 left A2
100 right A2
100 left B1
150 right B1
200 away B1
100 left B1
I have no clue as to how to begin so any insight would be amazing.
Edit #1: We only want to subtract this arbitrary value from groups that begin with A, so groups beginning with B will remain unchanged.
CodePudding user response:
You may try
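library(dplyr)
# data.table is only needed for rleid(), called with :: below, so it is not attached

df %>%
  # consecutive rows with the same category form one group (B1 occurs in two separate runs)
  group_by(run = data.table::rleid(category)) %>%
  # running total of time within each run
  mutate(ctime = cumsum(time)) %>%
  # shift the running total down by 200 for categories starting with "A"
  mutate(val1 = ifelse(startsWith(category, "A"), ctime - 200, ctime)) %>%
  # rows whose shifted total is not positive were fully used up by the subtraction
  filter(val1 > 0) %>%
  # difference the shifted totals to get back per-row times
  mutate(time = val1 - lag(val1, default = 0)) %>%
  ungroup() %>%
  select(time, look, category)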
time look category
<dbl> <chr> <chr>
1 150 left B1
2 170 right B1
3 100 left B1
4 370 right A1
5 100 left A1
6 100 right A2
7 100 left A2
8 100 right A2
9 100 left B1
10 150 right B1
11 200 away B1
12 100 left B1
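If the amount or the prefix needs to vary, the same pipeline could be wrapped in a small function. This is only a sketch of one way to do it: `subtract_from_runs`, `amount` and `prefix` are hypothetical names that do not appear in the question, and the sample data is rebuilt under the assumed name `df`.

```r
library(dplyr)

# Hypothetical helper: `subtract_from_runs`, `amount` and `prefix` are illustrative names.
subtract_from_runs <- function(df, amount = 200, prefix = "A") {
  df %>%
    # consecutive rows with the same category form one run
    group_by(run = data.table::rleid(category)) %>%
    # running total per run, shifted by `amount` only where category starts with `prefix`
    mutate(shifted = cumsum(time) -
             ifelse(startsWith(category, prefix), amount, 0)) %>%
    # drop rows that the subtraction used up completely
    filter(shifted > 0) %>%
    # difference the shifted totals to recover per-row times
    mutate(time = shifted - lag(shifted, default = 0)) %>%
    ungroup() %>%
    select(-run, -shifted)
}

# Sample data from the question (the name `df` is an assumption).
df <- data.frame(
  time = c(150, 170, 100, 100, 70, 400, 100,
           300, 100, 100, 100, 150, 200, 100),
  look = c("left", "right", "left", "away", "left", "right", "left",
           "right", "left", "right", "left", "right", "away", "left"),
  category = c("B1", "B1", "B1", "A1", "A1", "A1", "A1",
               "A2", "A2", "A2", "B1", "B1", "B1", "B1"),
  stringsAsFactors = FALSE
)

res <- subtract_from_runs(df, amount = 200, prefix = "A")
```

Like the pipeline above, this treats each consecutive run of a category as its own group, so a category that reappears later would have the amount subtracted again at the start of that later run.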