Let's suppose I have a very large vector:
set.seed(1)
my.vec <- runif(n = 1000000, min = -20, max = 20)
Now I would like to replace the values below -5 and above 5 with NA. This can be done with a loop:
for (i in seq_along(my.vec)) {
  # Set values outside [-5, 5] to NA; all other values stay as they are
  if (my.vec[i] < -5 || my.vec[i] > 5) {
    my.vec[i] <- NA
  }
}
Even though it is not the case in this simple example, the problem is that the loop takes quite a long time when the data structure is more complicated (for example, when it is applied to nested data frames in a list).
This is why I was wondering if there are faster and more efficient ways to replace certain values with NA than using a loop with conditions.
Anybody with a hint?
CodePudding user response:
my.vec[abs(my.vec) > 5] <- NA
CodePudding user response:
Use:
my.vec[my.vec < -5 | my.vec > 5] <- NA
Or:
my.vec[my.vec < -5 | my.vec > 5] = NA
Or use ifelse:
my.vec = ifelse((my.vec < -5 | my.vec > 5), NA, my.vec)
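The indexing and ifelse variants above should give the same result. A minimal sketch to check that and to time both on your own machine (no timings are quoted here because they depend on the system; `orig`, `v1` and `v2` are names made up for this example):

```r
set.seed(1)
orig <- runif(n = 1000000, min = -20, max = 20)

# Variant 1: logical indexing, modifies a copy in place
v1 <- orig
system.time(v1[v1 < -5 | v1 > 5] <- NA)

# Variant 2: ifelse() builds a complete new vector
system.time(v2 <- ifelse(orig < -5 | orig > 5, NA, orig))

# Both variants should agree element by element
identical(v1, v2)
```

Logical indexing only touches the elements that match the condition, which is usually why it is preferred over ifelse for this kind of replacement.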
CodePudding user response:
The replacement function is.na<- is meant for this.
is.na(my.vec) <- my.vec < -5 | my.vec > 5
or, following Yuriy's suggestion,
is.na(my.vec) <- abs(my.vec) > 5
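For the nested case mentioned in the question (data frames inside a list), the same vectorized replacement can be applied to each data frame with lapply. This is only a sketch, assuming every column is numeric; `df.list` and `replace_outside` are hypothetical names invented for this example:

```r
set.seed(1)

# Hypothetical example data: a list of two all-numeric data frames
df.list <- list(
  a = data.frame(x = runif(5, -20, 20), y = runif(5, -20, 20)),
  b = data.frame(x = runif(5, -20, 20), y = runif(5, -20, 20))
)

# Hypothetical helper: set every value outside [-limit, limit] to NA.
# abs(df) > limit yields a logical matrix, which can index the data frame.
replace_outside <- function(df, limit = 5) {
  df[abs(df) > limit] <- NA
  df
}

# Apply the helper to each data frame; no element-wise loop is needed
df.list <- lapply(df.list, replace_outside)
```

If some columns are not numeric, abs() will fail on the data frame, so the helper would first have to select the numeric columns.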