Say you have the following data frame:
data.frame(A_1 = rnorm(10),
A_2 = rnorm(10),
B_1 = rnorm(10),
B_2 = rnorm(10),
B_3 = rnorm(10),
C_1 = rnorm(10))
I would like to run rowMeans
on a per-group basis, so calculate row means for A_1
with A2
, then B_1
with B_2
and B_3
, and C_1
only with itself.
In my real data, I have a variable number of groups and columns in each group, although I can know in advance the prefixes.
I guess starts_with
might help, but not sure how to exactly do this...
CodePudding user response:
df %>%
rownames_to_column %>%
pivot_longer(-rowname) %>%
mutate(name=str_remove(name, "_.")) %>%
pivot_wider(values_fn=mean)
rowname A B C
<int> <dbl> <dbl> <dbl>
1 1 -0.214 0.738 -2.49
2 2 0.513 1.00 0.402
3 3 0.0740 0.323 1.47
4 4 0.130 0.108 0.122
5 5 -0.0603 -0.321 -0.353
6 6 -0.969 0.118 0.0847
7 7 -0.992 -0.459 -0.174
8 8 -0.745 0.338 1.35
9 9 0.143 -0.373 -1.06
10 10 0.0105 -0.341 -0.367
CodePudding user response:
If I got your point right, this could be one way to do it. So first value of column A
is the mean first row of A_1
and A_2
and so forth.
You can split.default
function for this purpose. It split a data frame based on column values but first we need to create a vector whose length is equal to the number of columns.
do.call(cbind, lapply(split.default(df, gsub('(\\w)_\\d', '\\1', names(df))), rowMeans))
A B C
[1,] -0.27871412 1.1904718 -1.96108015
[2,] -0.14647757 -0.1312197 -2.56966908
[3,] 0.08151200 0.9202600 0.46074139
[4,] -0.58858702 -0.8668282 -0.24436829
[5,] -0.84074058 -0.3621379 0.55942184
[6,] 0.18514753 0.3805017 -0.54003693
[7,] -0.48473620 -0.1472854 -0.12439466
[8,] 0.06439061 0.3810837 -1.23546481
[9,] 0.62065939 0.7266633 0.90504540
[10,] -0.78956030 0.4548080 -0.02160544
CodePudding user response:
An option with map
library(dplyr)
library(purrr)
library(stringr)
un1 <- unique(str_remove(names(df1), "_\\d "))
map_dfc(setNames(un1, un1), ~ df1 %>%
select(starts_with(.x)) %>%
rowMeans)
-output
# A tibble: 10 × 3
A B C
<dbl> <dbl> <dbl>
1 0.151 0.153 -0.592
2 0.692 0.140 0.00446
3 -0.0661 0.732 -0.0713
4 0.110 0.232 -1.07
5 0.604 0.656 -1.42
6 -0.522 -1.42 0.573
7 -0.918 0.0367 -0.553
8 0.266 1.17 0.547
9 0.526 -0.0425 -0.297
10 0.798 -0.586 0.213