I have a dataset similar to this:
> dput(df)
structure(list(Person_id = c("A", "A", "A", "A", "A", "B", "B",
"B"), Weight = c(170L, 164L, 160L, 150L, 149L, 250L, 225L, 230L
)), class = "data.frame", row.names = c(NA, -8L))
I want to create a column that indicates if the individual hits 150 lbs or below.
Person A eventually does reach 150 lbs, so that individual should be marked as 'Yes' on every row, even though they did not hit this threshold in the first three observations. Person B never reaches the 150 lb threshold, so they should be marked as 'No'.
The desired output should look like this:
> dput(df)
structure(list(Person_id = c("A", "A", "A", "A", "A", "B", "B",
"B"), Weight = c(170L, 164L, 160L, 150L, 149L, 250L, 225L, 230L
), Condition_met = c("Yes", "Yes", "Yes", "Yes", "Yes", "No",
"No", "No")), class = "data.frame", row.names = c(NA, -8L))
CodePudding user response:
Using dplyr you could do the following (the data frame from the question is called dat here):
library(dplyr)
dat %>%
  group_by(Person_id) %>%
  # any() is evaluated once per person, and the result is recycled to all of that person's rows
  mutate(Condition_met = if_else(any(Weight <= 150), "Yes", "No")) %>%
  ungroup()
#> # A tibble: 8 × 3
#>   Person_id Weight Condition_met
#>   <chr>      <int> <chr>
#> 1 A            170 Yes
#> 2 A            164 Yes
#> 3 A            160 Yes
#> 4 A            150 Yes
#> 5 A            149 Yes
#> 6 B            250 No
#> 7 B            225 No
#> 8 B            230 No
Or a similar base R approach using ave:
dat$Condition_met <- ave(dat$Weight, dat$Person_id, FUN = function(x) ifelse(any(x <= 150), "Yes", "No"))
dat
#> Person_id Weight Condition_met
#> 1 A 170 Yes
#> 2 A 164 Yes
#> 3 A 160 Yes
#> 4 A 150 Yes
#> 5 A 149 Yes
#> 6 B 250 No
#> 7 B 225 No
#> 8 B 230 No
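As a self-contained variation on the base R approach, here is a minimal sketch that rebuilds the example data from the question and computes each person's minimum weight first, then compares it to the threshold. The names threshold and min_weight are hypothetical helpers introduced for illustration, not part of the original post. It assumes there are no missing weights; with NA values, min() would return NA for that person unless na.rm = TRUE is passed.

```r
# Rebuild the example data from the question's dput() output.
dat <- data.frame(
  Person_id = c("A", "A", "A", "A", "A", "B", "B", "B"),
  Weight = c(170L, 164L, 160L, 150L, 149L, 250L, 225L, 230L)
)

# Hypothetical name for the cutoff, so it is easy to change in one place.
threshold <- 150

# ave() applies min() within each Person_id and returns the result
# repeated on every row of that person (same length as dat$Weight).
min_weight <- ave(dat$Weight, dat$Person_id, FUN = min)

# A person has met the condition if their lowest weight is at or below the cutoff.
dat$Condition_met <- ifelse(min_weight <= threshold, "Yes", "No")

print(dat)
```

Keeping the per-person minimum as its own numeric vector avoids relying on ave() coercing an integer column to character, and the intermediate value can be inspected if the result looks wrong.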