I have a data frame like this:
A1_A A1_B A1_C A1_D B1_A B1_B B1_C B1_D C1_A C1_B C1_C C1_D
1 0.86 0.9 0.75 0.65 0.12 0.35 0.45 0.44 0.2 0.4 0.6 0.7
2 ...
3 ...
I am trying to compute the mean of every 2 columns for every row using dplyr. The expected output:
A1_A A1_C B1_A B1_C C1_A C1_C
1 0.88 0.70 0.235 0.445 0.3 0.65
2 ...
3 ...
I understand that rowSums in base R can do something similar. Is there a way to do it in dplyr?
Although n=2 in the example here, my actual data varies dynamically and n can vary accordingly. I need a generalized method to aggregate row-wise data over every n columns.
Thanks!
Example for n=3
df <- structure(list(A1_A = 0.86, A1_B = 0.9, A1_C = 0.75, A1_D = 0.65,
A1_E = 0.6, A1_F = 0.65, B1_A = 0.12, B1_B = 0.35,
B1_C = 0.45, B1_D = 0.44, B1_E = 0.5, B1_F = 0.55,
C1_A = 0.2, C1_B = 0.4, C1_C = 0.6, C1_D = 0.7,
C1_E = 0.75, C1_F = 0.8), class = "data.frame", row.names = "1")
Output:
# A1_A A1_D B1_A B1_D C1_A C1_D
# 1 0.84 0.63 0.31 0.5 0.4 0.75
Answer:
For n=2, an easy way is to add the odd-numbered columns to the even-numbered columns and divide the result by 2.
(df[seq(1, ncol(df), 2)] + df[seq(2, ncol(df), 2)]) / 2
# A1_A A1_C B1_A B1_C C1_A C1_C
# 1 0.88 0.7 0.235 0.445 0.3 0.65
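To make the n=2 case reproducible, here is a minimal self-contained sketch in base R (not dplyr). The name df2 is hypothetical; its values are copied from the first row of the question's n=2 table. It assumes the data frame has an even number of columns, so that every odd-positioned column has a partner.

```r
# df2 is a hypothetical name; values are the first row of the question's n = 2 table
df2 <- data.frame(
  A1_A = 0.86, A1_B = 0.90, A1_C = 0.75, A1_D = 0.65,
  B1_A = 0.12, B1_B = 0.35, B1_C = 0.45, B1_D = 0.44,
  C1_A = 0.20, C1_B = 0.40, C1_C = 0.60, C1_D = 0.70
)

odd  <- seq(1, ncol(df2), by = 2)  # positions of the first column in each pair
even <- seq(2, ncol(df2), by = 2)  # positions of the second column in each pair

# Element-wise sum of the two column subsets, halved to give the pair means.
# The result keeps the names of the left-hand (odd) columns.
pair_means <- (df2[odd] + df2[even]) / 2
print(pair_means)
```

The arithmetic works on whole columns, so the same expression applies unchanged to a data frame with many rows.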
Generalization to any n (here n = 3):
n <- 3
# For each block of n consecutive columns, take the row-wise mean
unname(sapply(seq(1, ncol(df), n), \(x) rowMeans(df[x:(x + (n - 1))])))
# [1] 0.8366667 0.6333333 0.3066667 0.4966667 0.4000000 0.7500000
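The generalization above returns bare values without column names, whereas the question's expected output is a data frame labelled with the first column of each group. One possible way to get that shape, still in base R rather than dplyr, is sketched below. The helper name block_row_means is hypothetical, and the sketch assumes the number of columns is an exact multiple of n.

```r
# Example data from the question (n = 3)
df <- structure(list(A1_A = 0.86, A1_B = 0.9, A1_C = 0.75, A1_D = 0.65,
                     A1_E = 0.6, A1_F = 0.65, B1_A = 0.12, B1_B = 0.35,
                     B1_C = 0.45, B1_D = 0.44, B1_E = 0.5, B1_F = 0.55,
                     C1_A = 0.2, C1_B = 0.4, C1_C = 0.6, C1_D = 0.7,
                     C1_E = 0.75, C1_F = 0.8),
                class = "data.frame", row.names = "1")

# Hypothetical helper: row-wise mean over each consecutive block of n columns
block_row_means <- function(df, n) {
  # Assumption: columns divide evenly into blocks of n
  stopifnot(n >= 1, ncol(df) %% n == 0)
  starts <- seq(1, ncol(df), by = n)          # first column of each block
  means <- lapply(starts, function(i) rowMeans(df[i:(i + n - 1)]))
  names(means) <- names(df)[starts]           # label each block by its first column
  as.data.frame(means, row.names = rownames(df))
}

res <- block_row_means(df, 3)
print(round(res, 2))
```

Because n is an ordinary argument, calling block_row_means(df, 2) or block_row_means(df, 6) on the same data regroups the columns without any other change.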