How to include trim condition in mean function

Time:02-03

I am trying to write my own mean function. I want to add a trim argument so that outliers at the lower and upper ends are excluded. How do I do this?

Below is the mean function I currently have:

mymeanfunction <- function(x) {
  xbar <- sum(x)/length(x)
  xbar 
}

CodePudding user response:

Something like this? The function below accepts the extra argument na.rm, plus the dots argument so that the call can request a different whisker length via the coef argument of boxplot.stats.

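mymean <- function(x, trim = 0, na.rm = FALSE, ...) {
  # values lying beyond the boxplot whiskers; coef can be passed via ...
  out <- boxplot.stats(x, ...)$out
  # drop those outliers
  y <- x[!x %in% out]
  # trim is handed to mean() for optional additional trimming
  mean(y, trim = trim, na.rm = na.rm)
}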
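
To illustrate the coef argument mentioned above, here is a small sketch that calls boxplot.stats directly. The vector v is made up for this example and is not from the question:

```r
# made-up data with one obvious outlier
v <- c(1, 2, 3, 4, 5, 100)

# default coef = 1.5: 100 lies beyond the upper whisker
boxplot.stats(v)$out
# [1] 100

# a much larger coef widens the fences, so nothing is flagged
boxplot.stats(v, coef = 40)$out
# numeric(0)

# mean after dropping the flagged values
mean(v[!v %in% boxplot.stats(v)$out])
# [1] 3
```

A larger coef keeps more points, so the result moves back toward the plain mean.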

CodePudding user response:

The trim argument of mean is the fraction (between 0 and 0.5) of observations to remove from each end of the sorted vector before the mean is computed. You can approximate this with quantiles:

mymeanfunction <- function(x, trim=0) {
  if (trim > 0) {
    # cut-offs at the trim and 1 - trim quantiles
    q <- quantile(x, c(0 + trim, 1 - trim))
    # keep only values strictly between the cut-offs
    x <- x[x > q[1] & x < q[2]]
  }
  xbar <- sum(x)/length(x)
  xbar
}

To implement NA handling you could enhance the function like this:

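mymeanfunction <- function(x, trim=0, na.rm=FALSE) {
  # optionally drop missing values first
  if (na.rm) {
    x <- x[!is.na(x)]
  }
  if (anyNA(x)) {
    # any remaining NA makes the result NA, as with mean()
    xbar <- NA_real_
  } else {
    if (trim > 0) {
      # cut-offs at the trim and 1 - trim quantiles
      q <- quantile(x, c(0 + trim, 1 - trim))
      # keep only values strictly between the cut-offs
      x <- x[x > q[1] & x < q[2]]
    }
    xbar <- sum(x)/length(x)
  }
  xbar
}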

mymeanfunction(x, na.rm=TRUE)
# [1] 0.6362622
mymeanfunction(x, trim=.1, na.rm=TRUE)
# [1] 0.66136

## compare
mean(x, na.rm=TRUE)
# [1] 0.6362622
mean(x, trim=.1, na.rm=TRUE)
# [1] 0.66136

If there's no NA in the data, we don't need to specify na.rm=TRUE.

Note that the built-in mean is the better choice in practice, since its core computation is implemented in C. The function above is for educational purposes, so you can see what is going on.
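
One caveat: the quantile filter can give a different result from mean(x, trim=) in general, for example when length(x) * trim is not a whole number or when values tie with the cut-offs. Below is a minimal sketch of a count-based alternative that sorts the vector and drops floor(n * trim) values from each end, which is my understanding of what base R's mean does. The name trimmed_mean_sorted and the vector y are hypothetical, chosen only for this illustration:

```r
# Hypothetical helper: trimmed mean that drops a fixed count from each end.
trimmed_mean_sorted <- function(x, trim = 0, na.rm = FALSE) {
  if (na.rm) x <- x[!is.na(x)]
  if (anyNA(x)) return(NA_real_)
  n <- length(x)
  if (trim > 0 && n > 0) {
    # assumption: like mean(), fall back to the median for trim >= 0.5
    if (trim >= 0.5) return(median(x))
    k <- floor(n * trim)            # number of values dropped per end
    x <- sort(x)[(k + 1):(n - k)]   # keep the middle n - 2k values
  }
  sum(x) / length(x)
}

# made-up data where n * trim = 1.5 is not a whole number
y <- c(1, 2, 4, 8, 16, 32, 64, 128, 256, 512)
trimmed_mean_sorted(y, trim = 0.15)
# [1] 63.75
mean(y, trim = 0.15)
# [1] 63.75
```

On this y the quantile version drops two values from each end instead of one, so it returns 42 rather than 63.75.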


Data:

set.seed(42)
x <- c(runif(10), NA_real_)