R: What am I doing wrong in this for-loop with an if-else statement?

Sample data:

df <- data.frame(apples = c(1, 5, 3),
                 oranges = c(5, 3, 5))

Problem:

for(i in names(df)){
  
  if(sum(df$i == 5) == 1){
    
    print(paste("There is only 1 occurance of 5 fruit for", i))
    
  } else {
    
    print(paste("There is more than 1 occurance of 5 fruit for", i))
    
  }
}

This gives me:

[1] "There is more than 1 occurance of 5 fruit for apples"
[1] "There is more than 1 occurance of 5 fruit for oranges"

However, checking each column directly gives:

> sum(df$apples == 5)
[1] 1
> sum(df$oranges == 5)
[1] 2

My expected output:

[1] "There is only 1 occurance of 5 fruit for apples"
[1] "There is more than 1 occurance of 5 fruit for oranges"

I suspect it's some sort of syntax issue, or am I missing something more obvious?

CodePudding user response：

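Use df[[i]] instead of df$i in your loop. The $ operator takes the name after it literally, so df$i looks for a column named "i" rather than the column whose name is stored in i. No such column exists, so df$i is NULL, NULL == 5 is an empty logical vector, and its sum is 0. Since 0 == 1 is FALSE, the else branch always runs.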
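To make that concrete, here is a minimal sketch of the loop from the question with df[[i]] swapped in. Collecting the messages in a vector called msgs is my own addition for illustration; the original simply prints them.

```r
df <- data.frame(apples = c(1, 5, 3),
                 oranges = c(5, 3, 5))

# `$` does not evaluate i; it looks for a column literally called "i"
is.null(df$i)    # TRUE
sum(df$i == 5)   # 0, because NULL == 5 is logical(0)

msgs <- character(0)  # hypothetical container, not in the question
for (i in names(df)) {
  # `[[` evaluates i, so this picks the column named by its value
  if (sum(df[[i]] == 5) == 1) {
    msgs[i] <- paste("There is only 1 occurance of 5 fruit for", i)
  } else {
    msgs[i] <- paste("There is more than 1 occurance of 5 fruit for", i)
  }
}

writeLines(msgs)
# There is only 1 occurance of 5 fruit for apples
# There is more than 1 occurance of 5 fruit for oranges
```

As in the question, the else branch also covers columns with no 5 at all, so a separate check for a count of 0 may be worth adding.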

CodePudding user response：

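Instead of summing the columns one at a time, you could use colSums, which counts the 5s in every column in a single vectorised call and is generally faster than a loop. Its result is a named vector and ifelse keeps those names, so the output can be piped into lapply to process each element (where _ is the placeholder for the piped object). sprintf then inserts each element at the %s format specifier. The result is a list in which the fruit name is the name of each element rather than part of the string.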

ifelse(colSums(df == 5) == 1, 'only', 'more than') |>
  lapply(X=_, sprintf, fmt='There is %s 1 occurance of 5 fruits for')
# $apples
# [1] "There is only 1 occurance of 5 fruits for"
# 
# $oranges
# [1] "There is more than 1 occurance of 5 fruits for"

The _ placeholder requires R >= 4.2.
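If the fruit name should appear inside the message, as in the expected output in the question, a vectorised sprintf is one option. This is only a sketch: the names n5 and msgs are made up for illustration, and it tests for exactly one 5 the way the question's loop does. It uses no native pipe, so it should not depend on R 4.2.

```r
df <- data.frame(apples = c(1, 5, 3),
                 oranges = c(5, 3, 5))

# Named vector with the number of 5s in each column
n5 <- colSums(df == 5)

# sprintf is vectorised: one message per column, with the name filled in
msgs <- sprintf("There is %s 1 occurance of 5 fruit for %s",
                ifelse(n5 == 1, "only", "more than"),
                names(n5))

writeLines(msgs)
# There is only 1 occurance of 5 fruit for apples
# There is more than 1 occurance of 5 fruit for oranges
```

This returns a plain character vector rather than a list, which is usually easier to print or write to a file.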