Why does subtracting an empty vector in R delete everything?

Time:09-10

Could someone please enlighten me as to why subtracting an empty vector in R results in the whole content of a data frame being deleted? To give an example:

WhichInstances2 <- which(JointProcedures3$JointID %in% KneeIDcount$Var1[KneeIDcount$Freq >1])

JointProcedures3 <-JointProcedures3[-WhichInstances2,] 

This gives me an empty JointProcedures3 when every %in% comparison is FALSE (so that WhichInstances2 is a zero-length vector), but it should simply give me what JointProcedures3 was before those lines of code.

This is not the first time it has happened to me. I have asked my supervisor, and it has happened to him as well; he just thinks it is a quirk of R.

Rewriting the code as

WhichInstances2 <- which(JointProcedures3$JointID %in% KneeIDcount$Var1[KneeIDcount$Freq >1])

if(length(WhichInstances2)>0)
{
  JointProcedures3 <-JointProcedures3[-WhichInstances2,]
}

fixes the issue. But in principle it should not have made a scooby of a difference whether that conditional was there or not, since if length(WhichInstances2) were equal to 0, I would simply be subtracting nothing from the original JointProcedures3...

Thanks all for your input.

CodePudding user response:

It seems you are checking for ids in a vector and you intend to remove them from another; probably setdiff is what you are looking for.

Consider a vector of the lowercase letters of the alphabet (letters, an R built-in) from which we want to remove any entry that matches something that is not in there ("ab"). As programmers we would wish for nothing to be removed, keeping our 26 letters.

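# won't work: which() returns integer(0), and -integer(0) selects nothing
letters[-which(letters == "ab")]
character(0)

# works: setdiff() compares values, so pass the value itself, not its position
setdiff(letters, "ab")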
[1] "a" "b" "c" "d" "e" "f" "g" "h" "i" "j" "k" "l" "m" "n" "o" "p" "q" "r" "s" "t" "u"
[22] "v" "w" "x" "y" "z"
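
One caveat worth checking before using setdiff on an ID column: it works on values rather than positions, and it returns only the unique values of its first argument. A small sketch with made-up IDs (the vector ids is hypothetical, not from the question):

```r
ids <- c("k1", "k2", "k2", "k3")   # hypothetical IDs containing a duplicate

# Nothing matches "k9", but the duplicate "k2" is still collapsed
setdiff(ids, "k9")
# [1] "k1" "k2" "k3"

# Logical indexing keeps all four elements, duplicates included
ids[!(ids %in% "k9")]
# [1] "k1" "k2" "k2" "k3"
```

So setdiff suits a vector of distinct IDs; for filtering the rows of a data frame, logical indexing is usually the safer choice.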

CodePudding user response:

Let's try a simpler example to see what's happening.

x <- 1:5
y <- LETTERS[1:5]
which(x>4)
## [1] 5
y[which(x>4)]
## [1] "E"

So far so good ...

which(x>5)
## integer(0)
y[which(x>5)]
## character(0)

This is also fine. Now what if we negate? The problem is that integer(0) is a zero-length vector, so -integer(0) is also a zero-length vector. Indexing with a zero-length vector selects nothing (rather than dropping nothing), so y[-which(x>5)] is also a zero-length vector.
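
The same thing happens with the rows of a data frame. Here is a minimal sketch; the data frame df and its columns are made up for illustration and are not from the question:

```r
# Hypothetical data, for illustration only
df <- data.frame(id = c("a", "b", "c"), value = 1:3)

idx <- which(df$id %in% "zzz")   # no matches, so idx is integer(0)

# Negating a zero-length vector gives back a zero-length vector
identical(-idx, idx)
# [1] TRUE

# Indexing with it selects zero rows instead of dropping zero rows
dropped <- df[-idx, ]
nrow(dropped)
# [1] 0
```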

What can you do about it? Don't use which(); instead use logical indexing directly, and use ! to negate the condition:

y[!(x>5)]
## [1] "A" "B" "C" "D" "E"

In your case:

JointID_OK <- (JointProcedures3$JointID %in% KneeIDcount$Var1[KneeIDcount$Freq >1])

JointProcedures3 <-JointProcedures3[!JointID_OK,] 
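
A self-contained sketch of that pattern, using small made-up stand-ins for JointProcedures3 and KneeIDcount. The real data is not shown in the question, so the values, the Side column, and the numeric Var1 column below are all assumptions:

```r
# Made-up stand-ins for the data frames in the question
JointProcedures3 <- data.frame(JointID = c(101, 102, 103),
                               Side    = c("L", "R", "L"))
KneeIDcount <- data.frame(Var1 = c(101, 102, 103),
                          Freq = c(1, 1, 1))   # no ID occurs more than once

# TRUE for rows whose JointID is flagged for removal; all FALSE here
to_drop <- JointProcedures3$JointID %in% KneeIDcount$Var1[KneeIDcount$Freq > 1]

# Logical negation keeps every row when nothing is flagged
kept <- JointProcedures3[!to_drop, ]
nrow(kept)
# [1] 3

# The which()-based version loses every row in the same situation
lost <- JointProcedures3[-which(to_drop), ]
nrow(lost)
# [1] 0
```

The logical version needs no length check, because a vector of all FALSE negates to all TRUE rather than to an empty index.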