I have a dataset (insti) and I want to create 3 different subsets according to a factor (xarxa) with three levels (linkedin, instagram, twitter). I used this:
linkedin <- subset(insti, insti$xarxa=="linkedin")
twitter <- subset(insti, insti$xarxa=="twitter")
instagram <- subset(insti, insti$xarxa=="instagram")
It does work; however, I was wondering whether this can be done with tapply, so I tried:
tapply(insti, insti$xarxa, subset)
It gives this error:
Error in tapply(insti, insti$xarxa, subset) : arguments must have same length
I think there might be some straightforward way to do this, but I cannot work it out. Can you help me with this without using loops? Thanks a lot.
CodePudding user response:
It's usually better to deal with data frames in a named list. This makes them easy to iterate over, and stops your global workspace being filled up with lots of different variables. The easiest way to get a named list is with split(insti, insti$xarxa).
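As a minimal sketch of what working with that named list looks like, assume a small made-up stand-in for insti. Only the xarxa column name comes from the question; the followers column and all of its values are invented for illustration:

```r
# Hypothetical stand-in for the asker's data; only the xarxa column
# name comes from the question, everything else is invented.
insti <- data.frame(
  xarxa     = factor(c("linkedin", "twitter", "instagram",
                       "twitter", "linkedin", "instagram")),
  followers = c(120, 340, 560, 210, 90, 430)
)

# One data frame per level of xarxa, returned as a named list
by_xarxa <- split(insti, insti$xarxa)

names(by_xarxa)    # the list names are the factor levels
by_xarxa$twitter   # pull out a single subset by name

# Apply the same operation to every subset without writing a loop
rows_per_network <- sapply(by_xarxa, nrow)
mean_followers   <- sapply(by_xarxa, function(d) mean(d$followers))
```

Keeping the subsets together like this means any later step (a summary, a model, a plot) can be applied to all three networks with a single lapply or sapply call.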
If you really want the variables written directly to your global environment rather than kept in a list, you can do it in a single line:
list2env(split(insti, insti$xarxa), globalenv())
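As for the tapply error in the question: at least in the R version that produced that message, tapply checks that its first argument has the same length as the grouping factor, and the length of a data frame is its number of columns, not its number of rows. The sketch below uses a made-up stand-in for insti (the followers column and its values are assumptions, not data from the question) to show the mismatch, and that tapply is happy when given a single column:

```r
# Hypothetical stand-in for the asker's data; only the xarxa column
# name comes from the question, everything else is invented.
insti <- data.frame(
  xarxa     = factor(c("linkedin", "twitter", "instagram",
                       "twitter", "linkedin", "instagram")),
  followers = c(120, 340, 560, 210, 90, 430)
)

n_cols <- length(insti)         # length of a data frame = number of columns
n_rows <- length(insti$xarxa)   # length of the factor = number of rows

# These two lengths differ, which is what the reported
# "arguments must have same length" error is complaining about.
n_cols == n_rows

# tapply works on a single column: one summary value per level of xarxa
total_followers <- tapply(insti$followers, insti$xarxa, sum)
total_followers
```

So tapply is the right tool for summarising one column by group, while split is the one that hands back whole data frames.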
Example
Obviously, I don't have the insti data frame, since you did not supply any example data in your question, but we can demonstrate that the above solution works using the built-in iris data set.
First we can see that my global environment is empty:
ls()
#> character(0)
Now we get the iris data set, split it by species, and put the result in the global environment:
list2env(split(datasets::iris, datasets::iris$Species), globalenv())
#> <environment: R_GlobalEnv>
So now when we check the global environment's contents, we can see that we have three data frames, one for each Species:
ls()
#> [1] "setosa"     "versicolor" "virginica"
head(setosa)
#>   Sepal.Length Sepal.Width Petal.Length Petal.Width Species
#> 1          5.1         3.5          1.4         0.2  setosa
#> 2          4.9         3.0          1.4         0.2  setosa
#> 3          4.7         3.2          1.3         0.2  setosa
#> 4          4.6         3.1          1.5         0.2  setosa
#> 5          5.0         3.6          1.4         0.2  setosa
#> 6          5.4         3.9          1.7         0.4  setosa
And of course, we can also access versicolor and virginica in the same way.
Created on 2021-11-12 by the reprex package (v2.0.0)