I have a task that's becoming quite difficult for me.
I have to create a variable (pr_test_1) to test whether a variable for a procedure (I10_PR1) is in a list of procedures, and this code is working great:
df <- df %>%
  mutate(pr_test_1 = ifelse(I10_PR1 %in% abl_pr, 1, 0))
However, I have 25 variables for procedures (I10_PR1 to I10_PR25) and I have to create one for each (pr_test_1 to pr_test_25).
I can't find the right syntax to get a for loop to work.
Any help will be greatly appreciated!
CodePudding user response:
dplyr::across allows you to apply a function to multiple columns chosen with a selector (the code below uses the starts_with selector).
library(dplyr)
# sample data
df <- tibble::tibble(
  I10_PR1 = sample(100),
  I10_PR2 = sample(100),
  I10_PR3 = sample(100),
  I10_PR4 = sample(100)
)
# a sample list of values to compare against
match_list <- sample(10)
df %>%
  mutate(
    across(
      starts_with("I10_PR"),
      ~ if_else(.x %in% match_list, 1, 0),
      .names = "pr_test_{.col}"
    )
  )
#> # A tibble: 100 × 8
#>    I10_PR1 I10_PR2 I10_PR3 I10_PR4 pr_test_I10_PR1 pr_test_I10…¹ pr_te…² pr_te…³
#>      <int>   <int>   <int>   <int>           <dbl>         <dbl>   <dbl>   <dbl>
#>  1      93      45      47      46               0             0       0       0
#>  2      91      89      90      76               0             0       0       0
#>  3      16      32      30      24               0             0       0       0
#>  4      66      26      46      41               0             0       0       0
#>  5      53      51      79       9               0             0       0       1
#>  6      36      64      61      32               0             0       0       0
#>  7      45      75      23      25               0             0       0       0
#>  8      86      61      77      52               0             0       0       0
#>  9      17      87      64      53               0             0       0       0
#> 10       6      42      57      33               1             0       0       0
#> # … with 90 more rows, and abbreviated variable names ¹pr_test_I10_PR2,
#> #   ²pr_test_I10_PR3, ³pr_test_I10_PR4
Created on 2022-10-26 with reprex v2.0.2
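The across() call above names the new columns pr_test_I10_PR1, pr_test_I10_PR2, and so on, while the question asks for pr_test_1 to pr_test_25. A minimal sketch of one way to get those names is to follow the mutate() with dplyr::rename_with(). The small data frame and the abl_pr values here are made up for illustration and are not the asker's data.

```r
library(dplyr)

# Hypothetical stand-ins for the real data and procedure list
abl_pr <- c("A", "B", "C")
df <- tibble::tibble(
  I10_PR1 = c("A", "X", "C"),
  I10_PR2 = c("Y", "B", "Z"),
  I10_PR3 = c("Q", "R", "S")
)

result <- df %>%
  mutate(
    across(
      starts_with("I10_PR"),
      ~ if_else(.x %in% abl_pr, 1, 0),
      .names = "pr_test_{.col}"
    )
  ) %>%
  # Drop the "I10_PR" part so the new columns become pr_test_1, pr_test_2, ...
  rename_with(~ sub("^pr_test_I10_PR", "pr_test_", .x), starts_with("pr_test_"))

names(result)
```

Because starts_with("I10_PR") selects every matching column, the same code should cover all 25 procedure columns without listing them.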
CodePudding user response:
This for() loop works with your one (slightly modified) line of code and dynamic variable names. It runs over 1:3 to match the sample data below; use 1:25 for the full set of columns.
for (i in 1:3) {
  df <- df %>%
    mutate(!!paste0("pr_test_", i) := ifelse(!!as.name(paste0("I10_PR", i)) %in% abl_pr, 1, 0))
}
Data used:
abl_pr <- sample(LETTERS)[1:10]
I10_PR1 <- sample(LETTERS)
I10_PR2 <- sample(LETTERS)
I10_PR3 <- sample(LETTERS)
df <- data.frame(I10_PR1,I10_PR2,I10_PR3)
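The same loop can also be written without !! and as.name(). The sketch below assumes a reasonably recent dplyr (1.0 or later), where the .data pronoun looks up a column by its name as a string and a glue-style string on the left of := builds the new column name. The two-column data frame is hypothetical and only keeps the example short.

```r
library(dplyr)

# Hypothetical stand-ins for the real data and procedure list
abl_pr <- c("A", "B")
df <- data.frame(
  I10_PR1 = c("A", "Z"),
  I10_PR2 = c("Y", "B")
)

for (i in 1:2) {
  col <- paste0("I10_PR", i)  # name of the source column for this iteration
  df <- df %>%
    # "pr_test_{i}" is interpolated to pr_test_1, pr_test_2, ...
    mutate("pr_test_{i}" := ifelse(.data[[col]] %in% abl_pr, 1, 0))
}

names(df)
```

For the real data the loop would run over 1:25 instead of 1:2.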