Sample data:
df <- data.frame(apples = c(1, 5, 3),
                 oranges = c(5, 3, 5))
Problem:
for (i in names(df)) {
  if (sum(df$i == 5) == 1) {
    print(paste("There is only 1 occurance of 5 fruit for", i))
  } else {
    print(paste("There is more than 1 occurance of 5 fruit for", i))
  }
}
This gives me:
[1] "There is more than 1 occurance of 5 fruit for apples"
[1] "There is more than 1 occurance of 5 fruit for oranges"
However:
> sum(df$apples == 5)
[1] 1
> sum(df$oranges == 5)
[1] 2
My expected output:
[1] "There is only 1 occurance of 5 fruit for apples"
[1] "There is more than 1 occurance of 5 fruit for oranges"
I suspect it's some sort of syntax issue, or am I missing something more obvious?
CodePudding user response:
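You need to use df[[i]], not df$i, in your loop. The $ operator does not evaluate i; it looks for a column literally named "i" in the data frame. No such column exists, so df$i is NULL, NULL == 5 is logical(0), and sum(NULL == 5) is 0. The condition is therefore always FALSE, and the else branch always runs.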
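A minimal sketch of the corrected loop, assuming the df from the question. The msgs vector is a hypothetical helper, not part of the original code, added only so the results can be inspected afterwards.

```r
df <- data.frame(apples = c(1, 5, 3),
                 oranges = c(5, 3, 5))

# `$` does not evaluate its argument: df$i looks for a column named "i"
print(is.null(df$i))

msgs <- character(0)  # hypothetical helper: collects one message per column
for (i in names(df)) {
  # `[[` evaluates i, so this selects the column whose name is stored in i
  n <- sum(df[[i]] == 5)
  if (n == 1) {
    msg <- paste("There is only 1 occurance of 5 fruit for", i)
  } else {
    msg <- paste("There is more than 1 occurance of 5 fruit for", i)
  }
  print(msg)
  msgs[i] <- msg
}
```

With the sample data this should reproduce the expected output from the question, because apples contains a single 5 and oranges contains two.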
CodePudding user response:
Instead of summing the columns separately, you could use colSums, which handles all columns in one vectorized call and is generally faster. The result of the subsequent ifelse keeps the column names, and it can be piped into lapply to loop over its elements (where _ is the placeholder for the piped object). sprintf then inserts each element at the format specifier %s. The result is a named list.
# 'only' when a column contains exactly one 5, 'more than' otherwise
ifelse(colSums(df == 5) == 1, 'only', 'more than') |>
  lapply(X = _, sprintf, fmt = 'There is %s 1 occurance of 5 fruits for')
# $apples
# [1] "There is only 1 occurance of 5 fruits for"
#
# $oranges
# [1] "There is more than 1 occurance of 5 fruits for"
R >= 4.2 is needed for the _ placeholder (the native pipe |> itself was introduced in R 4.1).
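If the pipe placeholder is not available, the same idea can be sketched without it. This variant is an assumption about what the asker wants rather than the answer's own code: it puts the column name into the message itself and returns a plain character vector instead of a list. The names counts, qualifier, and msgs are illustrative.

```r
df <- data.frame(apples = c(1, 5, 3),
                 oranges = c(5, 3, 5))

# Count the 5s in every column at once; the result is named by column
counts <- colSums(df == 5)

# 'only' for exactly one 5, 'more than' otherwise (same test as the question)
qualifier <- ifelse(counts == 1, "only", "more than")

# sprintf is vectorized, so both %s slots are filled element by element
msgs <- sprintf("There is %s 1 occurance of 5 fruit for %s",
                qualifier, names(counts))
print(msgs)
```

Because sprintf recycles its arguments, no explicit loop or lapply is needed here.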