This is a follow up to my question here.
I have a dataframe like so:
ID <- c('A','A','A','A','A','A','A','A','A','A','A','A','A','A','A','A' )
EXP_P <- c(62,62,62,62,62,62,62,64,64,64,67,67,67,67,67,67)
BRR <- c(61,57,66,53,54,50,55,65,71,53,51,50,58,54,55,57)
val1 <- c(1,1,1,1,1,1,1,2,2,2,3,3,3,3,3,3)
CURR_PR <- c(1,1,1,1,1,1,1,2,2,2,2,3,3,3,3,3)
df <- data.frame(ID, EXP_P, BRR, val1, CURR_PR)
Output:
ID EXP_P BRR val1 CURR_PR
1 A 62 61 1 1
2 A 62 57 1 1
3 A 62 66 1 1
4 A 62 53 1 1
5 A 62 54 1 1
6 A 62 50 1 1
7 A 62 55 1 1
8 A 64 65 2 2
9 A 64 71 2 2
10 A 64 53 2 2
11 A 67 51 3 2
12 A 67 50 3 3
13 A 67 58 3 3
14 A 67 54 3 3
15 A 67 55 3 3
16 A 67 57 3 3
I have now added the new column BPP
that is dependent on each of the other columns using the rules specified in my initial question thanks to that response:
df %>%
group_by(ID, val1) %>%
mutate(occ = row_number()) %>%
group_by(ID) %>%
mutate(
BPP = case_when(
occ == 1 ~ EXP_P,
occ == 2 ~ lag(BRR),
occ == 3 ~ pmax(lag(BRR, 1), lag(BRR, 2), na.rm = TRUE),
TRUE ~ pmax(lag(BRR, 1), lag(BRR, 2), lag(BRR, 3), na.rm = TRUE)
)
) %>%
ungroup()
# # A tibble: 16 × 6
# ID EXP_P BRR val1 occ BPP
# <chr> <dbl> <dbl> <dbl> <int> <dbl>
# 1 A 62 61 1 1 62
# 2 A 62 57 1 2 61
# 3 A 62 66 1 3 61
# 4 A 62 53 1 4 66
# 5 A 62 54 1 5 66
# 6 A 62 50 1 6 66
# 7 A 62 55 1 7 54
# 8 A 64 65 2 1 64
# 9 A 64 71 2 2 65
# 10 A 64 53 2 3 71
# 11 A 67 51 3 1 67
# 12 A 67 50 3 2 51
# 13 A 67 58 3 3 51
# 14 A 67 54 3 4 58
# 15 A 67 55 3 5 58
# 16 A 67 57 3 6 58
Now I have another variable (CURR_PR)
and when this changes, I need the value in BRR
to be the same as it was under the previous value in BRR
.
For example, when CURR_PR
is 1
, the first value for BRR
is 61
. When CURR_PR
changes to 2
, I want BRR
to be the same value as it was for the first BRR
value when CURR_PR
is 1
(61)
.
When CURR_PR
is 2
, the first value for BRR
after the update will be 61
. When CURR_PR
changes to 3
, I want BRR
to be the same value as it was for the first BRR
value when CURR_PR
is 2
(61)
.
Note that this value will update through the calcs shown above in the mutate
and case_when
values, so it won't always be 61
.
Desired output:
ID EXP_P BRR val1 CURR_PR
1 A 62 61 1 1
2 A 62 57 1 1
3 A 62 66 1 1
4 A 62 53 1 1
5 A 62 54 1 1
6 A 62 50 1 1
7 A 62 55 1 1
8 A 64 61 2 2
9 A 64 71 2 2
10 A 64 53 2 2
11 A 67 51 3 2
12 A 67 61 3 3
13 A 67 58 3 3
14 A 67 54 3 3
15 A 67 55 3 3
16 A 67 57 3 3
CodePudding user response:
I was a little confused, basically every time that CURR_PR
change, BRR
will receive the first value of BRR
where CURR_PR
was 1? If so:
Code
df %>%
group_by(CURR_PR) %>%
mutate(aux = row_number()) %>%
ungroup() %>%
mutate(BRR = if_else(aux == 1 & CURR_PR > 1, BRR[aux == 1 & CURR_PR == 1],BRR))
Output
# A tibble: 16 x 6
ID EXP_P BRR val1 CURR_PR aux
<chr> <dbl> <dbl> <dbl> <dbl> <int>
1 A 62 61 1 1 1
2 A 62 57 1 1 2
3 A 62 66 1 1 3
4 A 62 53 1 1 4
5 A 62 54 1 1 5
6 A 62 50 1 1 6
7 A 62 55 1 1 7
8 A 64 61 2 2 1
9 A 64 71 2 2 2
10 A 64 53 2 2 3
11 A 67 51 3 2 4
12 A 67 61 3 3 1
13 A 67 58 3 3 2
14 A 67 54 3 3 3
15 A 67 55 3 3 4
16 A 67 57 3 3 5
CodePudding user response:
We may group by 'ID' and use replace
library(dplyr)
df %>%
group_by(ID) %>%
mutate(BRR = replace(BRR, !duplicated(CURR_PR), first(BRR))) %>%
ungroup
-output
# A tibble: 16 × 5
ID EXP_P BRR val1 CURR_PR
<chr> <dbl> <dbl> <dbl> <dbl>
1 A 62 61 1 1
2 A 62 57 1 1
3 A 62 66 1 1
4 A 62 53 1 1
5 A 62 54 1 1
6 A 62 50 1 1
7 A 62 55 1 1
8 A 64 61 2 2
9 A 64 71 2 2
10 A 64 53 2 2
11 A 67 51 3 2
12 A 67 61 3 3
13 A 67 58 3 3
14 A 67 54 3 3
15 A 67 55 3 3
16 A 67 57 3 3
If we are using the devel
version of dplyr
, can use .by
in mutate
df %>%
mutate(BRR = replace(BRR, !duplicated(CURR_PR), first(BRR)), .by = "ID")
-output
ID EXP_P BRR val1 CURR_PR
1 A 62 61 1 1
2 A 62 57 1 1
3 A 62 66 1 1
4 A 62 53 1 1
5 A 62 54 1 1
6 A 62 50 1 1
7 A 62 55 1 1
8 A 64 61 2 2
9 A 64 71 2 2
10 A 64 53 2 2
11 A 67 51 3 2
12 A 67 61 3 3
13 A 67 58 3 3
14 A 67 54 3 3
15 A 67 55 3 3
16 A 67 57 3 3