I am trying to replace all values in nat_locx with the value from the first row in LOCX and replace all the distance values with 0 if my first condition is met one or more times for id (my group_by() variable), but NOT if my second condition is met one or more times for id.
Here is an example of my data:
id DATE nat_locx LOCX distance loc_age condition
<fct> <date> <dbl> <dbl> <dbl> <dbl> <lgl>
6553 2004-06-27 13.5 2 487.90 26 TRUE
6553 2004-07-14 13.5 13.5 0 43 FALSE
6553 2004-07-15 13.5 12.5 30 44 FALSE
10160 2005-07-01 4.5 12 229.45588 36 TRUE
10160 2005-07-05 4.5 11 200.12496 40 TRUE
10160 2005-07-06 4.5 11 200.12496 41 TRUE
Here is what I have tried:
df <- df %>%
  group_by(id) %>%
  mutate(condition = case_when(
    loc_age >= 25 & loc_age < 40 & distance > 30 ~ TRUE,
    loc_age >= 40 & loc_age < 50 & distance > 60 ~ TRUE,
    TRUE ~ FALSE)) %>%
  mutate(nat_locx = if (condition == "TRUE") {
    first(LOCX) & distance == 0.00
  } else {
    nat_locx})
The first mutate() creates a new column of TRUE and FALSE values. If an id has even one FALSE, then the if/else statement I write afterwards should not change anything for that id.
In this example, this would mean that for id==6553 the if/else should not change anything. But because condition==TRUE for every row of id==10160, the if/else should proceed for that group.
Ideally, I'd like this output:
id DATE nat_locx LOCX distance loc_age condition
<fct> <date> <dbl> <dbl> <dbl> <dbl> <lgl>
6553 2004-06-27 13.5 2 487.90 26 TRUE
6553 2004-07-14 13.5 13.5 0 43 FALSE
6553 2004-07-15 13.5 12.5 30 44 FALSE
10160 2005-07-01 12 12 0 36 TRUE
10160 2005-07-05 12 11 0 40 TRUE
10160 2005-07-06 12 11 0 41 TRUE
A dplyr solution is preferred.
Answer:
As @Ben mentioned, we can wrap the condition in all() so that the changes are only applied to groups where every row is TRUE. We can use this for both the nat_locx and the distance columns.
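To see why all() helps, here is a minimal sketch on hypothetical toy data (the names toy, g, flag and x are made up for illustration and are not part of the question). Inside a grouped mutate(), all(flag) collapses each group's logical column to a single TRUE or FALSE, which is what a plain if () needs in order to branch.

```r
library(dplyr)

# Hypothetical toy data: group "a" is all TRUE, group "b" contains one FALSE.
toy <- data.frame(
  g    = c("a", "a", "b", "b"),
  flag = c(TRUE, TRUE, TRUE, FALSE),
  x    = c(1, 2, 3, 4)
)

toy_out <- toy %>%
  group_by(g) %>%
  # all(flag) is one value per group, so if () can branch on it;
  # the scalar 0 is recycled to the size of the group.
  mutate(x = if (all(flag)) 0 else x) %>%
  ungroup()

toy_out$x
```

Only group "a" is overwritten, because group "b" has a FALSE row. The same idea applied to the data from the question: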
library(tidyverse)
df %>%
group_by(id) %>%
mutate(
condition = case_when(
loc_age >= 25 & loc_age < 40 & distance > 30 ~ TRUE,
loc_age >= 40 & loc_age < 50 & distance > 60 ~ TRUE,
TRUE ~ FALSE
)
) %>%
mutate(nat_locx = if (all(condition)) first(LOCX) else nat_locx,
distance = if (all(condition)) 0 else distance)
Output
id DATE nat_locx LOCX distance loc_age condition
<int> <chr> <dbl> <dbl> <dbl> <int> <lgl>
1 6553 2004-06-27 13.5 2 488. 26 TRUE
2 6553 2004-07-14 13.5 13.5 0 43 FALSE
3 6553 2004-07-15 13.5 12.5 30 44 FALSE
4 10160 2005-07-01 12 12 0 36 TRUE
5 10160 2005-07-05 12 11 0 40 TRUE
6 10160 2005-07-06 12 11 0 41 TRUE
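For anyone who wants to reproduce this, below is a self-contained sketch that rebuilds the sample rows from the question and applies the same logic. It is one possible variant rather than the only way to do it: the helper column all_met is a name introduced here for illustration, DATE is kept as plain text for brevity, and if_else() is used in place of if/else so the group-level flag is computed once and reused.

```r
library(dplyr)

# Sample rows copied from the question (DATE left as character for brevity).
df <- data.frame(
  id       = factor(c(6553, 6553, 6553, 10160, 10160, 10160)),
  DATE     = c("2004-06-27", "2004-07-14", "2004-07-15",
               "2005-07-01", "2005-07-05", "2005-07-06"),
  nat_locx = c(13.5, 13.5, 13.5, 4.5, 4.5, 4.5),
  LOCX     = c(2, 13.5, 12.5, 12, 11, 11),
  distance = c(487.90, 0, 30, 229.45588, 200.12496, 200.12496),
  loc_age  = c(26, 43, 44, 36, 40, 41)
)

result <- df %>%
  group_by(id) %>%
  mutate(
    condition = case_when(
      loc_age >= 25 & loc_age < 40 & distance > 30 ~ TRUE,
      loc_age >= 40 & loc_age < 50 & distance > 60 ~ TRUE,
      TRUE ~ FALSE
    ),
    # all_met (hypothetical helper): TRUE only if every row of the id is TRUE
    all_met  = all(condition),
    nat_locx = if_else(all_met, first(LOCX), nat_locx),
    distance = if_else(all_met, 0, distance)
  ) %>%
  ungroup() %>%          # drop the grouping so later steps are not per-id
  select(-all_met)

result
```

Ending with ungroup() is optional, but it avoids surprises if more steps are chained onto the result later.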