Hello, I have created a for loop to split my data by the values in a certain column, like so:
for(i in 1:(nrow(df)))
{
team_[[i]] <- df %>% filter(team == i)
}
R doesn't like this: it says team_ not found. The code does run if I create a list first, like this:
team_ <- list()
for(i in 1:(nrow(df)))
{
team_[[i]] <- df %>% filter(team == i)
}
This works. However, I am given a list with thousands of empty items and just a few that contain my filtered data sets.
Is there a simpler way to create the data sets without this list function?
Thank you
CodePudding user response:
A simpler option is split from base R, which would be faster than using == to subset in a loop:
team_ <- split(df, df$team)
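As a rough illustration of what split returns, here is a small sketch on a made-up data frame (the score column and all values are assumptions for the example, not taken from the question). It produces one list element per distinct team value, named after that value, with no empty entries in between.

```r
# Toy data; the column names and values are made up for illustration
df <- data.frame(
  team  = c(3, 3, 17, 2045),
  score = c(10, 12, 7, 9)
)

# One data frame per distinct value of team, named by that value
team_ <- split(df, df$team)

length(team_)    # 3
names(team_)     # "3" "17" "2045"
team_[["17"]]    # the rows where team == 17
```

Elements are looked up by name (for example team_[["17"]]), so large team numbers do not leave gaps in the list.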
If we want to do some operations for each row, in tidyverse it can be done with rowwise:
library(dplyr)
df %>%
rowwise %>%
... step of operations ...
or, per team, with group_by:
df %>%
group_by(team) %>%
...
CodePudding user response:
The methods akrun suggests are much better than a loop, but you should understand why this isn't working. Remember that for(i in 1:nrow(df)) will give you one list item for i = 1, i = 2, etc., right up until i = nrow(df), which is several thousand by the sound of it. If you don't have any rows where team is 1, you will get an empty data frame as the first item, and the same will be true for every other value of i that isn't represented in team.
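To make that concrete, here is a minimal sketch that reruns the loop from the question on a made-up four-row data frame (the values and the score column are assumptions for illustration). Only the positions that match an actual team value end up with rows.

```r
library(dplyr)

# Toy data: four rows, but only teams 2 and 4 exist
df <- data.frame(team = c(2, 2, 4, 4), score = c(10, 12, 7, 9))

team_ <- list()
for (i in 1:nrow(df)) {
  # i runs over 1, 2, 3, 4 (row positions), not over the team values
  team_[[i]] <- df %>% filter(team == i)
}

# Number of rows in each list element
sapply(team_, nrow)   # 0 2 0 2
```

With several thousand rows and only a handful of teams, almost every element is an empty data frame, which matches what the question describes.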
A loop like this would work:
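team_ <- list()  # index by name below so large numeric team values don't pad the list
for (i in unique(df$team)) team_[[as.character(i)]] <- df %>% filter(team == i)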
But I would stick to a non-looping method as described by akrun.