How to calculate mean for specific columns in R?

The goal: I want to create 2 new columns by using R.

1 column which shows the mean of each row (but only calculating specific columns - only the mean of the columns which do not contain the string "_X")
1 column which shows the mean of each row (but only calculating specific columns - only the mean of the columns which do contain the string "_X").

For example:

phone1  phone1_X  phone2  phone2_X  phone3  phone3_X
1       2         3       4         5       6
2       4         6       8         10      12


Results:
Mean_of_none_X
3    (1 + 3 + 5) / 3
6    (2 + 6 + 10) / 3

Mean_of_X
4 
8

Thank you!

CodePudding user response：

Try using rowMeans and grep over the column names to include or exclude certain columns. Note that grep is case-sensitive by default, so the pattern has to be "_X" to match the column names in the question:

# only "_X"
rowMeans(df[, grep("_X", colnames(df))])

# no "_X"
rowMeans(df[, -grep("_X", colnames(df))])

Output:

#> # only "_X"
#> rowMeans(df[, grep("_X", colnames(df))])
#[1] 4 8
#> # no "_X"
#> rowMeans(df[, -grep("_X", colnames(df))])
#[1] 3 6
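
To get the two new columns the question asks for, the same idea can be sketched end to end as below. This is only an illustration: the data frame is rebuilt from the example table (assuming the third column is meant to be phone2), and the names Mean_of_none_X and Mean_of_X are taken from the question. grepl is swapped in for grep because negating a logical mask still works when no column matches, whereas -grep(...) would select no columns at all in that case.

```r
# Example data rebuilt from the question (column name phone2 is an assumption)
df <- data.frame(
  phone1 = c(1, 2), phone1_X = c(2, 4),
  phone2 = c(3, 6), phone2_X = c(4, 8),
  phone3 = c(5, 10), phone3_X = c(6, 12)
)

# Logical mask over the original columns: TRUE where the name contains "_X"
is_x <- grepl("_X", names(df), fixed = TRUE)

# Subset first, before adding any new column, so the new columns
# (whose names also contain "_X") are never part of the averages
x_cols     <- df[, is_x, drop = FALSE]
other_cols <- df[, !is_x, drop = FALSE]

df$Mean_of_none_X <- rowMeans(other_cols)
df$Mean_of_X      <- rowMeans(x_cols)

print(df)
```

drop = FALSE keeps the subset a data frame even if only one column matches, so rowMeans still receives a two-dimensional object.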

CodePudding user response：

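Try split.default to split the columns by whether their names end in "_X", then apply rowMeans to each group: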

> lapply(split.default(df, endsWith(names(df), "_X")), rowMeans)
$`FALSE`
[1] 3 6

$`TRUE`
[1] 4 8
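
The result is a list with elements named "FALSE" and "TRUE", so it can be attached back to the data frame as the two requested columns. A minimal sketch, assuming the example data from the question (with the third column named phone2) and the column names Mean_of_none_X and Mean_of_X used there:

```r
# Example data rebuilt from the question (column name phone2 is an assumption)
df <- data.frame(
  phone1 = c(1, 2), phone1_X = c(2, 4),
  phone2 = c(3, 6), phone2_X = c(4, 8),
  phone3 = c(5, 10), phone3_X = c(6, 12)
)

# Split the columns into two data frames keyed by "name ends with _X",
# then take the row means of each part
means <- lapply(split.default(df, endsWith(names(df), "_X")), rowMeans)

# The list names are the strings "FALSE" and "TRUE"
df$Mean_of_none_X <- means[["FALSE"]]
df$Mean_of_X      <- means[["TRUE"]]

print(df)
```

Because the list is computed before either column is assigned, the new columns do not feed back into the averages.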

CodePudding user response：

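library(dplyr)

df %>%
  rowwise() %>%   # compute the means one row at a time
  # contains() ignores case by default, so '_X' also matches '_x'
  mutate(x_mean = mean(c_across(contains('_X'))),
         # also exclude '_mean' columns so the new x_mean is not averaged in
         notx_mean = mean(c_across(!contains('_X') & !contains('_mean')))) %>%
  ungroup()       # drop the rowwise grouping from the result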