Using for loops with mutate function?

Time:10-27

I have a task that's becoming quite difficult for me.

I have to create a variable (pr_test_1) to test whether a variable for a procedure (I10_PR1) is in a list of procedures, and this code is working great:

df <- df %>%
  mutate(pr_test_1 = ifelse(I10_PR1 %in% abl_pr, 1, 0))

However, I have 25 variables for procedures (I10_PR1 to I10_PR25) and I have to create one for each (pr_test_1 to pr_test_25).

I can't seem to find the right syntax to get a for loop to work.

Any help will be greatly appreciated!

CodePudding user response:

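dplyr::across lets you apply a function to multiple columns chosen with a selector (the example below uses the starts_with selector). Note that with this .names specification the new columns are named pr_test_I10_PR1, pr_test_I10_PR2, and so on, rather than pr_test_1, pr_test_2.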

library(dplyr)

# sample data
df <- tibble::tibble(
  I10_PR1 = sample(100),
  I10_PR2 = sample(100),
  I10_PR3 = sample(100),
  I10_PR4 = sample(100)
)

# a sample list of values to compare against
match_list <- sample(10)

df %>%
  mutate(
    across(
      starts_with("I10_PR"),
      ~ if_else(.x %in% match_list, 1, 0),
      .names = "pr_test_{.col}"
    )
  )
#> # A tibble: 100 × 8
#>    I10_PR1 I10_PR2 I10_PR3 I10_PR4 pr_test_I10_PR1 pr_test_I10…¹ pr_te…² pr_te…³
#>      <int>   <int>   <int>   <int>           <dbl>         <dbl>   <dbl>   <dbl>
#>  1      93      45      47      46               0             0       0       0
#>  2      91      89      90      76               0             0       0       0
#>  3      16      32      30      24               0             0       0       0
#>  4      66      26      46      41               0             0       0       0
#>  5      53      51      79       9               0             0       0       1
#>  6      36      64      61      32               0             0       0       0
#>  7      45      75      23      25               0             0       0       0
#>  8      86      61      77      52               0             0       0       0
#>  9      17      87      64      53               0             0       0       0
#> 10       6      42      57      33               1             0       0       0
#> # … with 90 more rows, and abbreviated variable names ¹​pr_test_I10_PR2,
#> #   ²​pr_test_I10_PR3, ³​pr_test_I10_PR4

Created on 2022-10-26 with reprex v2.0.2
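
If the exact names pr_test_1 to pr_test_25 from the question are needed, one possible tweak is to strip the prefix inside the .names glue specification. The following is only a sketch under assumptions: the two-column sample data, the fixed match_list, and the seed are made up for illustration, and the sub() call assumes every selected column starts with the literal prefix I10_PR.

```r
library(dplyr)

set.seed(1)  # assumption: fixed seed so the sketch is reproducible

# hypothetical sample data with two procedure columns
df <- tibble::tibble(
  I10_PR1 = sample(100),
  I10_PR2 = sample(100)
)

# hypothetical list of values to compare against
match_list <- 1:10

out <- df %>%
  mutate(
    across(
      starts_with("I10_PR"),
      ~ if_else(.x %in% match_list, 1, 0),
      # .names is a glue string, so an R expression can rewrite the column name:
      # "I10_PR7" becomes "pr_test_7"
      .names = "pr_test_{sub('I10_PR', '', .col)}"
    )
  )

names(out)
```

With the real data, the same call should cover all 25 columns without changes, since starts_with("I10_PR") picks up every column with that prefix.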

CodePudding user response:

This for() loop works with your one line of code, slightly modified so that the variable names are built dynamically:

library(dplyr)

# 1:3 matches the sample data below; use 1:25 for the real data
for (i in 1:3) {
  df <- df %>%
    mutate(!!paste0("pr_test_", i) := ifelse(!!as.name(paste0("I10_PR", i)) %in% abl_pr, 1, 0))
}

Data used:

abl_pr <- sample(LETTERS)[1:10]
I10_PR1 <- sample(LETTERS)
I10_PR2 <- sample(LETTERS)
I10_PR3 <- sample(LETTERS)

df <- data.frame(I10_PR1,I10_PR2,I10_PR3)
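
A variant of the same loop that may be easier to read builds both names as strings first and looks up the source column through dplyr's .data pronoun instead of !!as.name(). This is a sketch rather than a replacement for the answer above: src_col and new_col are hypothetical helper names, and the seed is only there to make the sample data reproducible.

```r
library(dplyr)

set.seed(1)  # assumption: fixed seed so the sketch is reproducible

# same kind of sample data as above
abl_pr <- sample(LETTERS)[1:10]
df <- data.frame(
  I10_PR1 = sample(LETTERS),
  I10_PR2 = sample(LETTERS),
  I10_PR3 = sample(LETTERS)
)

for (i in 1:3) {
  src_col <- paste0("I10_PR", i)   # existing procedure column
  new_col <- paste0("pr_test_", i) # indicator column to create
  df <- df %>%
    # !!new_col := sets the new column's name from a string;
    # .data[[src_col]] reads the existing column by its name
    mutate(!!new_col := ifelse(.data[[src_col]] %in% abl_pr, 1, 0))
}

head(df)
```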