I have a data frame with 20 columns and 20 rows. I want to calculate the median of every five rows for each column, using a loop or a function. The result should have 20 columns with four medians each (one per block of five rows).
CodePudding user response:
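library(dplyr)

set.seed(1)
# Example data: a 20 x 20 matrix of random values
matrix(rexp(400, rate = .1), ncol = 20) %>%
  as_tibble(.name_repair = ~paste0('X', 1:20)) %>%
  # id labels each block of five consecutive rows (1 to 4)
  group_by(id = rep(1:4, each = 5)) %>%
  summarise(
    across(everything(), median)
  )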
# A tibble: 4 x 21
id X1 X2 X3 X4 X5 X6 X7 X8 X9 X10 X11 X12 X13 X14 X15 X16 X17 X18 X19 X20
<int> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
1 1 4.36 5.66 10.8 11.1 8.19 5.16 7.43 10.3 14.8 7.51 6.75 10.9 13.2 7.37 6.92 17.8 4.63 9.56 7.17 5.03
2 2 9.57 9.97 9.95 8.37 13.5 7.63 11.9 8.05 21.1 5.42 6.73 6.22 4.79 10.1 4.33 12.8 6.55 5.00 7.61 3.43
3 3 12.4 3.24 5.95 17.8 6.81 14.2 3.79 17.4 12.9 8.02 10.3 5.62 7.22 5.21 2.92 2.65 6.85 4.29 5.85 0.542
4 4 6.55 7.25 7.74 8.15 2.64 2.99 7.85 7.12 7.62 7.37 6.63 6.46 7.50 12.9 10.9 6.59 1.93 10.5 4.68 11.4
CodePudding user response:
In base R you could accomplish the same by using tapply:
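set.seed(1)
m <- matrix(rexp(400, rate = .1), ncol = 20)
# Group every cell by its column index and its block of five rows (0 to 3),
# take the median of each group, then transpose so blocks are rows
t(tapply(m, list(col(m), (row(m) - 1) %/% 5), median))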
1 2 3 4 5 6 7
0 4.360686 5.658655 10.798811 11.081767 8.185142 5.162061 7.430436 ...
1 9.565675 9.968130 9.945558 8.370065 13.456440 7.631800 11.910946 ...
2 12.376036 3.240102 5.946177 17.847654 6.812291 14.195492 3.788268 ...
3 6.547466 7.252143 7.741878 8.145358 2.637383 2.991589 7.851209 ...
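Since the question asks for a function, here is a minimal sketch of how the same idea could be wrapped up in base R. The name block_medians and its size argument are made up for illustration, and it assumes every column of the data frame is numeric. It uses aggregate() rather than tapply, so it is a variant of the approach above, not the exact same code.

```r
# Hypothetical helper: median of every `size` consecutive rows, per column.
# Assumes all columns of `df` are numeric.
block_medians <- function(df, size = 5) {
  # Block label for each row: 1,1,1,1,1,2,2,... (a trailing block may be shorter)
  block <- (seq_len(nrow(df)) - 1) %/% size + 1
  # One output row per block, one median per column, plus a `block` column
  aggregate(df, by = list(block = block), FUN = median)
}

set.seed(1)
df <- as.data.frame(matrix(rexp(400, rate = .1), ncol = 20))
res <- block_medians(df)
dim(res)  # 4 21: four blocks, the block label plus 20 columns
```

If the number of rows is not a multiple of size, the last block simply contains the leftover rows, which may or may not be what you want.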