For loop in R for creating new data frames with respect to rows of a particular column

Time:03-14

Hello, I have created a for loop to split my data by the values of a particular column, like so:

for (i in 1:nrow(df)) {
  team_[[i]] <- df %>% filter(team == i)
}

R doesn't like this: it reports that team_ is not found. The code does run if I first create an empty list, like so:

team_ <- list()

for (i in 1:nrow(df)) {
  team_[[i]] <- df %>% filter(team == i)
}

This works. However, I get a list with thousands of empty items and only a few that contain my filtered data sets.

Is there a simpler way to create the data sets without this list?

Thank you.

CodePudding user response:

A simpler option is split from base R, which would be faster than using == to subset in a loop:

team_ <- split(df, df$team)
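As a minimal sketch of what split returns, here it is on a small made-up data frame. The column names team and score and their values are assumptions for illustration, not taken from the question:

```r
# Hypothetical toy data; the team IDs are sparse on purpose
df <- data.frame(
  team  = c(3, 7, 3, 12, 7),
  score = c(10, 20, 30, 40, 50)
)

# One data frame per distinct team value, named after that value
team_ <- split(df, df$team)

names(team_)    # one name per team that actually occurs
length(team_)   # no empty placeholders for teams that do not exist
team_[["7"]]    # the rows where team == 7
```

Because the list is named by team value, each subset is looked up with a character key such as team_[["7"]] rather than by position.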

If we want to do some operations for each row in the tidyverse, that can be done with rowwise:

library(dplyr)
df %>%
    rowwise %>%
    ... step of operations ...

or, to operate on each team as a group, with group_by:

df %>%
   group_by(team) %>%
   ...
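To make the group_by pattern concrete, here is a small sketch. The summarise step stands in for the elided operations and is only an assumed example; the toy data frame and its score column are hypothetical as well:

```r
library(dplyr)

# Hypothetical toy data, not from the question
df <- data.frame(
  team  = c(3, 7, 3, 12, 7),
  score = c(10, 20, 30, 40, 50)
)

# One output row per team: row count and mean score within each group
team_summary <- df %>%
  group_by(team) %>%
  summarise(n_rows = n(), mean_score = mean(score))

team_summary
```

The data stays in a single data frame here, so no list of per-team subsets is needed unless later steps really require separate objects.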

CodePudding user response:

The methods akrun suggests are much better than a loop, but you should understand why your loop isn't working. Remember that for(i in 1:nrow(df)) will give you one list item for i = 1, i = 2, and so on, right up to i = nrow(df), which is several thousand by the sound of things. If you don't have any rows where team is 1, you will get an empty data frame as the first item, and the same is true for every other value of i that doesn't occur in the team column.
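The effect is easy to reproduce on a small made-up data frame. In this sketch the data are hypothetical, and base R subsetting stands in for filter so that it runs without dplyr:

```r
# Hypothetical toy data: five rows, teams 3, 7 and 12
df <- data.frame(
  team  = c(3, 7, 3, 12, 7),
  score = c(10, 20, 30, 40, 50)
)

team_ <- list()
for (i in 1:nrow(df)) {
  # Base R equivalent of df %>% filter(team == i)
  team_[[i]] <- df[df$team == i, ]
}

# Number of rows in each list item
sizes <- sapply(team_, nrow)
sizes
```

Only the third item has any rows, because 3 is the only value between 1 and nrow(df) that occurs in team. In this toy data, teams 7 and 12 are never visited at all, since they are larger than the number of rows.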

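A loop like this would work, assuming team_ has been created as an empty list first. Indexing by as.character(i) names each item after its team, so numeric team values don't leave empty gaps in the list:

team_ <- list()
for (i in unique(df$team)) team_[[as.character(i)]] <- df %>% filter(team == i)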

But I would stick to a non-looping method as described by akrun.
