Accessing variable name in for loop in R?

I am trying to run a for loop in which I randomly subsample a dataset using the sample_n command. I also want to name each new subsampled data frame "df1", "df2", "df3", where the numbers correspond to i in the for loop. I know the way I wrote this code is wrong, and I understand why I am getting the error. How can I combine "df" and i in the for loop so that it reads as df1, df2, etc.? Happy to clarify if needed. Thanks!

for (i in 1:9){ print(get(paste("df", i, sep=""))) = sub %>% group_by(dietAandB) %>% sample_n(1) }

Error in print(get(paste("df", i, sep = ""))) = sub %>% group_by(dietAandB) %>% : target of assignment expands to non-language object

CodePudding user response：

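get() only looks up an existing object by name, so it cannot be used as the target of an assignment. To create an object whose name is built at run time, you could use assign() instead.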

Using some fake example data:

library(dplyr, warn.conflicts = FALSE)

sub <- data.frame(
  dietAandB = LETTERS[1:2]
)

for (i in 1:2) { 
  assign(paste0("df", i), sub %>% group_by(dietAandB) %>% sample_n(1) |> ungroup()) 
}
df1
#> # A tibble: 2 × 1
#>   dietAandB
#>   <chr>    
#> 1 A        
#> 2 B
df2
#> # A tibble: 2 × 1
#>   dietAandB
#>   <chr>    
#> 1 A        
#> 2 B
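To make the assign()/get() pairing a bit more concrete, here is a minimal base R sketch. The data frame `toy` and the object names `toy_df1` to `toy_df3` are made up for illustration, and the sampling uses base `sample()` rather than `sample_n()` so the snippet runs without extra packages.

```r
# Hypothetical toy data standing in for the question's data frame
toy <- data.frame(dietAandB = c("A", "A", "B", "B"), value = 1:4)

for (i in 1:3) {
  # assign() creates (or overwrites) an object whose name is built at run time
  assign(paste0("toy_df", i), toy[sample(nrow(toy), 2), ])
}

# get() retrieves ONE object by its name given as a string
first <- get("toy_df1")

# mget() retrieves several objects at once and returns them as a named list
all_samples <- mget(paste0("toy_df", 1:3))
names(all_samples)
```

Note that as soon as you want to work with all the samples together, you end up collecting them into a list anyway, which is one reason the list approach is usually preferred.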

But the more R-ish way to do this would be to use a list instead of creating single objects:

df <- list()
for (i in 1:2) {
  df[[i]] <- sub %>% group_by(dietAandB) %>% sample_n(1) |> ungroup()
}

df
#> [[1]]
#> # A tibble: 2 × 1
#>   dietAandB
#>   <chr>    
#> 1 A        
#> 2 B        
#> 
#> [[2]]
#> # A tibble: 2 × 1
#>   dietAandB
#>   <chr>    
#> 1 A        
#> 2 B
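If you still want the "df1", "df2", ... labels, the list elements can carry those names. A small sketch, again with a made-up data frame `toy` and base `sample()` in place of `sample_n()`:

```r
# Hypothetical toy data: three rows per group
toy <- data.frame(dietAandB = rep(c("A", "B"), each = 3), value = 1:6)

samples <- list()
for (i in 1:3) {
  rows <- sample(nrow(toy), 2)               # pick 2 random row indices
  samples[[paste0("df", i)]] <- toy[rows, ]  # store under the name "df<i>"
}

names(samples)
# "df1" "df2" "df3"

# Elements can then be reached by name, with $ or [[ ]]
samples$df2
samples[["df3"]]
```

This keeps the names you asked for while leaving everything in a single object that is easy to loop over.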

Or, more concisely, use lapply instead of a for loop:

df <- lapply(1:2, function(x) sub %>% group_by(dietAandB) %>% sample_n(1) |> ungroup())

df
#> [[1]]
#> # A tibble: 2 × 1
#>   dietAandB
#>   <chr>    
#> 1 A        
#> 2 B        
#> 
#> [[2]]
#> # A tibble: 2 × 1
#>   dietAandB
#>   <chr>    
#> 1 A        
#> 2 B
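Two side notes, as a hedged sketch rather than a definitive recipe. First, the fake data above has only one row per group, so every draw is necessarily identical; with several rows per group the draws can differ. Second, in current dplyr versions sample_n() is superseded by slice_sample(), which is used below. The data frame `toy`, the seed, and the column name `draw` are assumptions made for illustration.

```r
library(dplyr, warn.conflicts = FALSE)

# Hypothetical toy data: three rows per group, so draws can differ
toy <- data.frame(
  dietAandB = rep(c("A", "B"), each = 3),
  value = 1:6
)

set.seed(123)  # arbitrary seed, only to make the draws repeatable

# One random row per group, repeated three times
draws <- lapply(1:3, function(i) {
  toy %>%
    group_by(dietAandB) %>%
    slice_sample(n = 1) %>%
    ungroup()
})
names(draws) <- paste0("df", seq_along(draws))

# Stack the draws into one data frame; the "draw" column records
# which list element each row came from
combined <- bind_rows(draws, .id = "draw")
```

Stacking the draws like this is often handier for later summaries than keeping separate objects, since you can group by the `draw` column.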

CodePudding user response：

It depends on the sample size, which is missing from your question. As an example, I used the mtcars dataset (32 rows) and drew three subsamples of size 20 from it:

library(dplyr)
for (i in 1:3) {
  assign(paste0("df", i), sample_n(mtcars, 20))
}
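As a possible variation on the loop above, the same three subsamples could be drawn into a named list first and only then, if really needed, turned into separate objects. This sketch uses base R (`replicate()`, `sample()`, `list2env()`) instead of sample_n(); the seed value is arbitrary and only there to make the result repeatable.

```r
set.seed(42)  # arbitrary seed so the draws are repeatable

# Three subsamples of 20 rows each from mtcars, kept as a list
draws <- replicate(3, mtcars[sample(nrow(mtcars), 20), ], simplify = FALSE)
names(draws) <- paste0("df", seq_along(draws))

# Optional: copy the list elements into the current environment
# as separate objects named df1, df2, df3
list2env(draws, envir = environment())

nrow(df1)
# 20
```

Since sampling here is without replacement, the sample size must not exceed the 32 rows of mtcars.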