I currently have the data frame below and am trying to build a list of 5 data frames, each containing every 5th row of the original df. Is there a way to select every 5th row and add it to a new data frame in a list, either with a for loop or with lapply?
df
X1 X2 X3 X4 X5
1 0 0 1.501990 0
2 0 0 1.883904 0
3 0 0 1.333195 0
4 0 0 0.000000 0
5 0 0 2.136760 0
6 0 0 2.186790 0
7 0 0 1.269592 0
8 0 0 1.458405 0
9 0 0 1.816493 0
10 0 0 0.000000 0
11 0 0 2.190029 0
12 0 0 0.000000 0
13 0 0 1.460534 0
14 0 0 1.470776 0
15 0 0 1.675406 0
16 0 0 1.842470 0
17 0 0 1.937999 0
18 0 0 0.000000 0
19 0 0 1.649926 0
20 0 0 2.067902 0
For example, the first data frame would consist of the 1st, 6th, 11th, and 16th rows, while the next would start with the 2nd row and continue down the rows of df in the same way.
CodePudding user response:
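Use split with 1:5 as the grouping vector. The vector is recycled down the rows, so rows 1, 6, 11, 16 land in group 1, rows 2, 7, 12, 17 in group 2, and so on, giving 5 data frames that each hold every 5th row.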
split(df, 1:5)
output
$`1`
X1 X2 X3 X4 X5
1 1 0 0 1.501990 0
6 6 0 0 2.186790 0
11 11 0 0 2.190029 0
16 16 0 0 1.842470 0
$`2`
X1 X2 X3 X4 X5
2 2 0 0 1.883904 0
7 7 0 0 1.269592 0
12 12 0 0 0.000000 0
17 17 0 0 1.937999 0
$`3`
X1 X2 X3 X4 X5
3 3 0 0 1.333195 0
8 8 0 0 1.458405 0
13 13 0 0 1.460534 0
18 18 0 0 0.000000 0
$`4`
X1 X2 X3 X4 X5
4 4 0 0 0.000000 0
9 9 0 0 1.816493 0
14 14 0 0 1.470776 0
19 19 0 0 1.649926 0
$`5`
X1 X2 X3 X4 X5
5 5 0 0 2.136760 0
10 10 0 0 0.000000 0
15 15 0 0 1.675406 0
20 20 0 0 2.067902 0
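Since the question asks about a for loop or lapply, here is a minimal sketch of both, assuming the goal is the same grouping that split(df, 1:5) produces. The small df, k, parts_lapply and parts_loop below are stand-ins made up for illustration, not the asker's actual objects.

```r
# Stand-in data: 20 rows, X1 holds the row number (as in the question's df).
df <- data.frame(X1 = 1:20, X2 = 0L)

k <- 5  # number of data frames to build

# lapply version: element i takes rows i, i + k, i + 2k, ...
parts_lapply <- lapply(seq_len(k), function(i) df[seq(i, nrow(df), by = k), ])

# Equivalent for loop, filling a pre-allocated list.
parts_loop <- vector("list", k)
for (i in seq_len(k)) {
  parts_loop[[i]] <- df[seq(i, nrow(df), by = k), ]
}

parts_lapply[[1]]$X1  # rows 1, 6, 11, 16
```

Both give an unnamed list of 5 data frames. split is shorter, but the explicit seq(i, nrow(df), by = k) makes the row selection easier to see.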
An alternative with dplyr::group_split, which works here because nrow(df) is a multiple of 5:
dplyr::group_split(df, rep(1:5, nrow(df)/5), .keep = FALSE)
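Note that rep(1:5, nrow(df)/5) relies on the row count being a multiple of 5; otherwise the grouping vector no longer matches the number of rows. A hedged sketch of one way around that, using base R's rep_len to build a grouping vector of exactly the right length (df22, grp and parts are hypothetical names, and the 22-row data frame is invented for the example):

```r
# Stand-in data with 22 rows, so the row count is not a multiple of 5.
df22 <- data.frame(X1 = 1:22, X2 = 0L)

# rep_len() recycles 1:5 to exactly nrow(df22) values: 1 2 3 4 5 1 2 ...
grp <- rep_len(1:5, nrow(df22))

# Same every-5th-row grouping; the first two groups simply get one extra row.
parts <- split(df22, grp)

sapply(parts, nrow)
```

The same grp vector could presumably be passed to group_split in place of the rep() call, though I have only checked it with split here.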
data
df <- structure(list(X1 = 1:20, X2 = c(0L, 0L, 0L, 0L, 0L, 0L, 0L,
0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L), X3 = c(0L,
0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L,
0L, 0L, 0L), X4 = c(1.50199, 1.883904, 1.333195, 0, 2.13676,
2.18679, 1.269592, 1.458405, 1.816493, 0, 2.190029, 0, 1.460534,
1.470776, 1.675406, 1.84247, 1.937999, 0, 1.649926, 2.067902),
X5 = c(0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L,
0L, 0L, 0L, 0L, 0L, 0L, 0L)), class = "data.frame", row.names = c(NA,
-20L))
CodePudding user response:
A possible solution, based on dplyr::group_split. The id column takes the zero-based row index modulo 5, so there are 5 groups and each one holds every 5th row:
library(dplyr)
df %>%
  mutate(id = 0:(nrow(df)-1) %% 5) %>%
  group_by(id) %>%
  group_split(.keep = FALSE) %>%
  as.list
#> [[1]]
#> # A tibble: 4 × 5
#> X1 X2 X3 X4 X5
#> <int> <int> <int> <dbl> <int>
#> 1 1 0 0 1.50 0
#> 2 6 0 0 2.19 0
#> 3 11 0 0 2.19 0
#> 4 16 0 0 1.84 0
#>
#> [[2]]
#> # A tibble: 4 × 5
#> X1 X2 X3 X4 X5
#> <int> <int> <int> <dbl> <int>
#> 1 2 0 0 1.88 0
#> 2 7 0 0 1.27 0
#> 3 12 0 0 0 0
#> 4 17 0 0 1.94 0
#>
#> [[3]]
#> # A tibble: 4 × 5
#> X1 X2 X3 X4 X5
#> <int> <int> <int> <dbl> <int>
#> 1 3 0 0 1.33 0
#> 2 8 0 0 1.46 0
#> 3 13 0 0 1.46 0
#> 4 18 0 0 0 0
#>
#> [[4]]
#> # A tibble: 4 × 5
#> X1 X2 X3 X4 X5
#> <int> <int> <int> <dbl> <int>
#> 1 4 0 0 0 0
#> 2 9 0 0 1.82 0
#> 3 14 0 0 1.47 0
#> 4 19 0 0 1.65 0
#>
#> [[5]]
#> # A tibble: 4 × 5
#> X1 X2 X3 X4 X5
#> <int> <int> <int> <dbl> <int>
#> 1 5 0 0 2.14 0
#> 2 10 0 0 0 0
#> 3 15 0 0 1.68 0
#> 4 20 0 0 2.07 0
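The modulus in the id column decides how many data frames come out, so it can be worth checking the group sizes before splitting. A small base-R check, assuming 20 rows and 5 desired groups as in the question (n and id are illustrative names only):

```r
n <- 20  # number of rows, as in the question's df

# Zero-based row index modulo 5 gives ids 0, 1, 2, 3, 4, 0, 1, ...
id <- (seq_len(n) - 1) %% 5

table(id)       # how many rows fall in each group
which(id == 0)  # rows in the first group: 1, 6, 11, 16
```

If table(id) shows a different number of groups than expected, the modulus is the first thing to look at.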