I am working in R and want to go through every distinct name in this table. If A == "yes" or B == "yes" in any row for a name, a new column C should be TRUE for all rows with that Name; otherwise C should be FALSE. I don't know how to combine a for loop with this if statement and keep getting error messages, although it should be a simple task.
Name | A | B |
---|---|---|
Jordan | yes | no |
Pascal | yes | no |
Nando | no | yes |
Nando | no | no |
Nico | no | no |
Nico | no | no |
This should be the result:
Name | A | B | C |
---|---|---|---|
Jordan | yes | no | TRUE |
Pascal | yes | no | TRUE |
Nando | no | yes | TRUE |
Nando | no | no | TRUE |
Nico | no | no | FALSE |
Nico | no | no | FALSE |
CodePudding user response:
For-loops are often not needed in R.
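library(dplyr)
# dat is the data frame from the question (columns Name, A, B)
dat |>
group_by(Name) |>
# %in% already returns TRUE/FALSE, so no if_else() wrapper is needed
mutate(C = "yes" %in% c(A, B))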
#> # A tibble: 6 x 4
#> # Groups: Name [4]
#> Name A B C
#> <chr> <chr> <chr> <lgl>
#> 1 Jordan yes no TRUE
#> 2 Pascal yes no TRUE
#> 3 Nando no yes TRUE
#> 4 Nando no no TRUE
#> 5 Nico no no FALSE
#> 6 Nico no no FALSE
Created on 2022-07-05 by the reprex package (v2.0.1)
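For anyone who wants to run the snippet above from scratch, here is a minimal self-contained sketch. The definition of `dat` is an assumption (the question's table typed in as a data frame), and the names `dat` and `res` are only illustrative. It uses `any()` instead of `%in%`, which should give the same result here, and adds `ungroup()` so the grouping does not carry over into later steps.

```r
library(dplyr)

# Assumed definition of `dat`: the question's table as a data frame
dat <- data.frame(
  Name = c("Jordan", "Pascal", "Nando", "Nando", "Nico", "Nico"),
  A = c("yes", "yes", "no", "no", "no", "no"),
  B = c("no", "no", "yes", "no", "no", "no")
)

res <- dat |>
  group_by(Name) |>
  # any() is TRUE if at least one row of the group has "yes" in A or B
  mutate(C = any(A == "yes" | B == "yes")) |>
  ungroup()  # drop the grouping so later steps see the whole table
```

Whether to keep or drop the grouping depends on what comes next; dropping it is usually the safer default.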
CodePudding user response:
R favors vectorized operations over explicit loops, so the more common way in R would be:
df <- data.frame(
Name = c("Jordan","Pascal","Nando","Nando","Nico","Nico"),
A = c("yes","yes","no","no","no","no"),
B = c("no","no","yes","no","no","no")
)
df$C <- df$Name %in% df$Name[df$A == "yes" | df$B == "yes"]
This solution does not rely on any additional packages.
If you feel strongly about looping, you could loop over unique(df$Name), or you could aggregate by df$Name, but both of those are more involved and less efficient.
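Since the question asked specifically how to combine a for loop with an if statement, here is a rough sketch of the two alternatives just mentioned, in base R. The variable names `nm` and `rows` and the extra column `C_ave` are illustrative choices, not part of the original answer.

```r
# Same example data as in the answer above
df <- data.frame(
  Name = c("Jordan", "Pascal", "Nando", "Nando", "Nico", "Nico"),
  A = c("yes", "yes", "no", "no", "no", "no"),
  B = c("no", "no", "yes", "no", "no", "no")
)

# Loop version: start with FALSE everywhere, then flip whole groups to TRUE
df$C <- FALSE
for (nm in unique(df$Name)) {
  rows <- df$Name == nm  # logical index of the rows belonging to this name
  if (any(df$A[rows] == "yes" | df$B[rows] == "yes")) {
    df$C[rows] <- TRUE
  }
}

# Grouped version: ave() applies any() within each Name and
# writes the result back to every row of that group
df$C_ave <- ave(df$A == "yes" | df$B == "yes", df$Name, FUN = any)
```

The key point in the loop is that `if` needs a single TRUE/FALSE, so the row-wise comparison is collapsed with `any()`; passing the whole comparison vector to `if` is a common source of errors in this kind of task.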