Breaking the for loop does not provide the correct output

Time:07-11

I fit three different k-means models to the iris data set and would like to compare their Rand index (RI) in a for loop. If the RI of a model is larger than that of the next one, the loop should stop and return the larger RI.

For example, if the RI of the first model is larger than the RI of the second model, the loop should break and give the RI of the first model. In my example, the RI of the second model is larger than that of the third model, so the loop should break and give me the value for the second model, fit2$cluster. When I run the loop, it prints the number 1, i.e. the first model, which is not correct. A way to return the name of the model would be even better. Any help, please?

Here is my try:

library(aricode)  ## contains the RI function
fit1 <- kmeans(iris[,-5], centers = 2)
fit2 <- kmeans(iris[,-5], centers = 3)
fit3 <- kmeans(iris[,-5], centers = 4)
fit <- list(fit1$cluster, fit2$cluster, fit3$cluster)

Here is my for loop:

for (i in seq_along(fit)) {
  if (RI(fit[[i]], iris[, 5]) > RI(fit[[i + 1]], iris[, 5])) break
  # x <- RI(fit[[i]], iris[, 5])
  print(i)
}

CodePudding user response:

It's not clear why you want to print: a printed value can only be read, not used in later code. Also, break isn't needed if the goal is the largest RI, since the loop then has to run to the end.

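Here we use model 1 as the starting value and update it whenever a later model has a larger RI than the best one so far.

w <- 1L
for (i in seq_along(fit)[-1L]) {
  # compare with the best model so far (w), not just the previous one
  if (RI(fit[[i]], iris[, 5]) > RI(fit[[w]], iris[, 5])) {
    w <- i
  }
}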
w
# [1] 2

RI(fit[[w]], iris[, 5]) 
# [1] 0.8797315
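
The running-maximum idea can be checked without fitting anything. A minimal sketch, assuming a hypothetical helper `best_index()` and made-up scores (neither is from the original post): each score is compared with the best one seen so far, which gives a different answer from comparing with the immediately preceding score when the values go down and then up again.

```r
# Hypothetical helper: index of the largest score, found with a plain loop.
best_index <- function(scores) {
  w <- 1L                            # start with the first model as the best
  for (i in seq_along(scores)[-1L]) {
    if (scores[i] > scores[w]) {     # compare with the best so far, not with i - 1
      w <- i
    }
  }
  w
}

# Made-up scores for illustration: the maximum is the first element.
scores <- c(0.9, 0.5, 0.6)
best_index(scores)
# [1] 1
```

Comparing with `scores[i - 1]` instead would end on 3 here, because 0.6 > 0.5 even though 0.9 is the overall maximum.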

Alternatively, it would be much easier if RI() were vectorized, so let's do it!

RIv <- Vectorize(RI, vectorize.args='c1')

RIv(fit, iris[, 5])
# [1] 0.7636689 0.8797315 0.8295302

To learn which model has the largest value, we use which.max,

RIv(fit, iris[, 5]) |> which.max()
# [1] 2

to simply get the largest value, we pipe it into max,

RIv(fit, iris[, 5]) |> max()
# [1] 0.8797315

or all together:

RIv(fit, iris[, 5]) |> {\(.) {w=which.max(.); data.frame(model=w, value=.[w])}}()
#   model     value
# 1     2 0.8797315
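
The question also asked for the model's name rather than its position. One possible way, sketched here under the assumption that the list elements are given names (the names `fit1`, `fit2`, `fit3` mirror the objects in the question, and `ri` is a name introduced for this example), is to let `sapply` carry the names through to `which.max`:

```r
library(aricode)  # provides RI()

set.seed(42)  # kmeans() starts from random centers; the seed value is arbitrary
fit <- list(
  fit1 = kmeans(iris[, -5], centers = 2)$cluster,
  fit2 = kmeans(iris[, -5], centers = 3)$cluster,
  fit3 = kmeans(iris[, -5], centers = 4)$cluster
)

# sapply() keeps the list names, so the result is a named numeric vector
ri <- sapply(fit, RI, c2 = iris[, 5])

names(which.max(ri))  # name of the model with the largest RI
max(ri)               # the largest RI itself
```

Because k-means depends on its random starting centers, the exact RI values can differ between runs unless a seed is set.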

CodePudding user response:

It prints only 1 precisely because the RI of your second model is larger than that of the third. The if condition is therefore satisfied in the 2nd iteration, and the loop breaks before print(i) is reached for i = 2, so only 1 is printed. Instead, try this:

for (i in seq_len(length(fit) - 1)) {  # stop one short so fit[[i + 1]] exists
  if (RI(fit[[i]], iris[, 5]) > RI(fit[[i + 1]], iris[, 5])) {
    print(i)
    break
  }
}
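
The order of operations can be traced without refitting anything. A small sketch, using the three RI values from the first answer's output as a plain vector (the names `ri` and `printed` are introduced here for illustration) and recording what the question's loop would send to `print()`:

```r
# RI values as reported in the first answer for 2, 3 and 4 centers
ri <- c(0.7636689, 0.8797315, 0.8295302)

printed <- integer(0)                 # what the question's loop would print
for (i in seq_len(length(ri) - 1)) {  # stop one short so ri[i + 1] exists
  if (ri[i] > ri[i + 1]) break        # question's order: break first ...
  printed <- c(printed, i)            # ... so print(i) is skipped on the break
}
printed
# [1] 1
i
# [1] 2
```

After the loop, `i` still holds the index at which the break happened, so it can be used directly in later code instead of printing inside the loop.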