How to get the mean of specific columns in dataframe and store in vector (in R)

Time:10-26

I want to get the mean of specific columns in a dataframe and store those means in a vector in R.

The names of those columns are stored in a character vector whose contents depend on user input. For each of those variables I want to calculate the mean and store the results in a vector, which I can then loop over in another part of my code.

For example, I tried the following:

specific_variables <- c("variable1", "variable2")  # can be of a different length depending on user input
data <- data.frame(...)  # a dataframe with multiple columns, including "variable1" and "variable2"
mean_xm <- 0  # empty variable for storage purposes

# for loop over the variables
for (i in length(specific_variables)) {
  mean_xm[i] <- mean(data$specific_variables[i], na.rm = TRUE)
}

print(mean_xm)

This fails with: Error: object of type 'closure' is not subsettable

Second attempt using sapply:

colMeans(data[sapply(data, is.numeric)])

This gives me the means of all numeric columns in the dataframe, whereas I only want the means of the columns named in specific_variables. Ideally, I'd like to store those means in a vector, as in my first attempt.

CodePudding user response:

We may use

v1 <- unname(colMeans(data[specific_variables], na.rm = TRUE))
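
To make this concrete, here is a minimal sketch on made-up data. The data frame `df` and its values are hypothetical, chosen only so the result is easy to check. It also shows one way the original loop could be repaired: `for (i in length(x))` runs a single iteration (with `i` equal to the length), and `data$specific_variables[i]` looks for a column literally named `specific_variables`, so `seq_along()` and `[[ ]]` are used instead. The 'closure' error itself usually means that `data` still referred to the base R function `data()` rather than to a data frame at the point of the call.

```r
# Hypothetical sample data; column names mirror those in the question
df <- data.frame(
  variable1 = c(1, 2, NA, 4),
  variable2 = c(10, 20, 30, 40),
  variable3 = c("a", "b", "c", "d")
)
specific_variables <- c("variable1", "variable2")

# Vectorised version: select the columns by name, then take column means
v1 <- unname(colMeans(df[specific_variables], na.rm = TRUE))

# Repaired loop: seq_along() visits every index, and [[ ]] looks up
# a column by the name held in a variable (the $ operator cannot do that)
mean_xm <- numeric(length(specific_variables))
for (i in seq_along(specific_variables)) {
  mean_xm[i] <- mean(df[[specific_variables[i]]], na.rm = TRUE)
}

print(v1)
print(mean_xm)
```

Both `v1` and `mean_xm` end up as plain numeric vectors of the same length as `specific_variables`, so either can be looped over later. Dropping `unname()` keeps the variable names attached to the means, which may be handy for lookups by name.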