How to include trim condition in mean function

I am trying to create my own mean function and already have the code below. I want to add a trim argument so that outliers at the lower and upper ends are excluded. How do I do this?

Below is the mean function I currently have:

mymeanfunction <- function(x) {
  xbar <- sum(x)/length(x)
  xbar 
}

CodePudding user response：

Something like this? The function below accepts the extra argument na.rm and the dots argument, so that the caller can ask for a different whisker length via the coef argument of boxplot.stats.

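mymean <- function(x, trim, na.rm = FALSE, ...) {
  # 'trim' is accepted but not used; outliers are the values that
  # boxplot.stats places beyond the whiskers ('...' can pass coef)
  out <- boxplot.stats(x, ...)$out
  # keep everything that is not flagged as an outlier
  y <- x[!x %in% out]
  mean(y, na.rm = na.rm)
}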
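
A short usage sketch of the function above, on a made-up vector with one obvious outlier (the data is an assumption for illustration, not from the question). The function is repeated so the block runs on its own. Passing coef = 0 tells boxplot.stats to flag no outliers, so the plain mean comes back.

```r
mymean <- function(x, trim, na.rm = FALSE, ...) {
  out <- boxplot.stats(x, ...)$out
  y <- x[!x %in% out]
  mean(y, na.rm = na.rm)
}

v <- c(1, 2, 3, 4, 100)   # hypothetical data; 100 lies far beyond the whiskers

mymean(v)            # default coef = 1.5: 100 is dropped
# [1] 2.5
mymean(v, coef = 0)  # coef = 0: no outliers are flagged
# [1] 22
```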

CodePudding user response：

The trim= argument of mean is the fraction (0 to 0.5) of observations to be trimmed from each end of the vector before the mean is computed. A simple way to approximate this is to use quantiles as cut-offs:

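mymeanfunction <- function(x, trim=0) {
  if (trim > 0) {
    # lower and upper cut-offs at the trim and 1 - trim quantiles
    q <- quantile(x, c(0 + trim, 1 - trim))
    # keep only values strictly between the cut-offs
    x <- x[x > q[1] & x < q[2]]
  }
  xbar <- sum(x)/length(x)
  xbar
}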

To implement NA handling you could enhance the function like this:

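mymeanfunction <- function(x, trim=0, na.rm=FALSE) {
  # drop missing values first if requested
  if (na.rm) {
    x <- x[!is.na(x)]
  }
  # any remaining NA makes the result NA, as with mean()
  if (anyNA(x)) {
    xbar <- NA_real_
  } else {
    if (trim > 0) {
      # cut-offs at the trim and 1 - trim quantiles
      q <- quantile(x, c(0 + trim, 1 - trim))
      x <- x[x > q[1] & x < q[2]]
    }
    xbar <- sum(x)/length(x)
  }
  xbar
}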

mymeanfunction(x, na.rm=TRUE)
# [1] 0.6362622
mymeanfunction(x, trim=.1, na.rm=TRUE)
# [1] 0.66136

## compare
mean(x, na.rm=TRUE)
# [1] 0.6362622
mean(x, trim=.1, na.rm=TRUE)
# [1] 0.66136

If there's no NA in the data, we don't need to specify na.rm=TRUE.

Note that the built-in mean will be much faster, since its computation is implemented in C. For educational purposes, though, this version shows what is going on.
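
The quantile cut-offs agree with mean on the example data, but they can differ in other cases, for instance when length(x) * trim is not a whole number or when there are ties at the cut-offs. As a rough sketch of count-based trimming, which drops floor(n * trim) sorted values from each end (the rule mean.default appears to follow), something like the following could be used. The name trimmed_mean is made up for this illustration.

```r
# Hypothetical helper: count-based trimmed mean (a sketch, not the base R source)
trimmed_mean <- function(x, trim = 0, na.rm = FALSE) {
  if (na.rm) x <- x[!is.na(x)]
  if (anyNA(x)) return(NA_real_)
  n <- length(x)
  if (trim > 0 && n > 0) {
    # trimming half or more from each end leaves only the median
    if (trim >= 0.5) return(median(x))
    k <- floor(n * trim)            # number of values dropped at each end
    x <- sort(x)[(k + 1):(n - k)]   # keep the middle n - 2k values
  }
  sum(x) / length(x)
}

set.seed(42)
x <- c(runif(10), NA_real_)

# agrees with the built-in for a trim that removes whole observations ...
all.equal(trimmed_mean(x, trim = 0.1, na.rm = TRUE),
          mean(x, trim = 0.1, na.rm = TRUE))
# [1] TRUE

# ... and for one where floor(n * trim) is 0, so nothing is dropped
all.equal(trimmed_mean(x, trim = 0.05, na.rm = TRUE),
          mean(x, trim = 0.05, na.rm = TRUE))
# [1] TRUE
```

With trim = 0.05 and ten values, the quantile version would still discard the smallest and largest value, whereas the count-based rule keeps all ten.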

Data:

set.seed(42)
x <- c(runif(10), NA_real_)