Fast replacing of values in a large vector

Time:09-28

Let's suppose I have a very large vector:

set.seed(1)
my.vec <- runif(n = 1000000, min = -20, max = 20)

Now I would like to replace the values below -5 and above 5 with NA. This can be done by using a loop:

for (i in seq_along(my.vec)) {
  # Each test is on a single element, so use || rather than the vectorized |
  if (my.vec[i] < -5 || my.vec[i] > 5) {
    my.vec[i] <- NA
  }
}

This simple example runs quickly enough, but the loop becomes very slow when the data structure is more complicated (for example, when it is applied to nested data frames in a list).

So I was wondering: is there a faster, more efficient way to replace certain values with NA than a loop with conditions?

Anybody with a hint?

CodePudding user response:

my.vec[abs(my.vec) > 5] <- NA
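
To convince yourself that this one-liner does the same job as the loop in the question, you can run both on copies of the vector and compare. This is only a sketch; loop.vec and fast.vec are names I made up for the comparison.

```r
set.seed(1)
my.vec <- runif(n = 1000000, min = -20, max = 20)

# Loop from the question, applied to a copy
loop.vec <- my.vec
for (i in seq_along(loop.vec)) {
  if (loop.vec[i] < -5 || loop.vec[i] > 5) {
    loop.vec[i] <- NA
  }
}

# Vectorized replacement, applied to another copy
fast.vec <- my.vec
fast.vec[abs(fast.vec) > 5] <- NA

# Both copies should now hold NA in exactly the same positions
same.result <- identical(loop.vec, fast.vec)
print(same.result)
```

The logical index abs(fast.vec) > 5 is computed once for the whole vector, so there is no per-element branching in R code.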

CodePudding user response:

Use:

my.vec[my.vec < -5 | my.vec > 5] <- NA

Or:

my.vec[my.vec < -5 | my.vec > 5] = NA

Or use ifelse:

my.vec = ifelse((my.vec < -5 | my.vec > 5), NA, my.vec)
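
If speed matters, note that ifelse() builds a completely new vector, while the indexed assignment only overwrites the selected elements. A rough way to compare the two on your own machine is sketched below; by.index and by.ifelse are made-up names, and the timings will differ from system to system, so none are quoted here.

```r
set.seed(1)
my.vec <- runif(n = 1000000, min = -20, max = 20)

# Indexed assignment on a copy
by.index <- my.vec
t.index <- system.time(
  by.index[by.index < -5 | by.index > 5] <- NA
)

# ifelse() returns a new vector of the same length
t.ifelse <- system.time(
  by.ifelse <- ifelse(my.vec < -5 | my.vec > 5, NA, my.vec)
)

# Elapsed times in seconds, for a rough comparison only
print(rbind(index = t.index, ifelse = t.ifelse)[, "elapsed"])

# The results themselves should match
stopifnot(identical(by.index, by.ifelse))
```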

CodePudding user response:

The replacement function is.na<- is meant for this.

is.na(my.vec) <- my.vec < -5 | my.vec > 5

or, following Yuriy's suggestion,

is.na(my.vec) <- abs(my.vec) > 5
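
For the case mentioned in the question (data frames nested in a list), the same idea can be applied column by column with lapply. This is only a sketch under assumptions: df.list is invented example data, and mask_outside is a hypothetical helper, not a base R function.

```r
# Invented example data: a list of data frames with mixed column types
set.seed(1)
df.list <- list(
  a = data.frame(x = runif(5, -20, 20), y = runif(5, -20, 20)),
  b = data.frame(x = runif(5, -20, 20), label = letters[1:5])
)

# Hypothetical helper: set numeric values outside [-limit, limit] to NA
mask_outside <- function(df, limit = 5) {
  num.cols <- vapply(df, is.numeric, logical(1))
  df[num.cols] <- lapply(df[num.cols], function(col) {
    is.na(col) <- abs(col) > limit
    col
  })
  df
}

# Apply the helper to every data frame in the list
df.list <- lapply(df.list, mask_outside)
print(df.list)
```

Non-numeric columns such as label are left untouched, and the only loops left are the ones over list elements and columns, not over individual values.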

Tags: r