R: For loop says object not found

I have a dataset named trainset, and I'm trying to use a for loop to iterate over specific columns and sum the values, repeating this for every row in the dataset.

The loop looks like this and it works fine, giving me 825 results:

for (i in 1:nrow(trainset)) {
  Systolic.BP[i] <- trainset$`Systolic blood pressure`[i]
  BUN[i] <- trainset$`Urea nitrogen`[i]
  Sodium[i]  <- trainset$`Blood sodium`[i]
  Age[i]  <- trainset$age[i]
  HR[i]  <- trainset$`heart rate`[i]
  COPD[i]  <- trainset$COPD[i]
  
  outcome.pred.gwtg[i] <- m.gwtg(Systolic.BP[i], BUN[i], Sodium[i], Age[i], HR[i], COPD[i])
}

But when I actually run it, I get an error: Error: object 'Systolic.BP' not found

Does anyone know how to solve this problem? Thanks!

CodePudding user response:

The reason you are getting the error is that the first time the loop runs, the line

Systolic.BP[i] <- trainset$`Systolic blood pressure`[i]

tries to write the first entry of trainset$`Systolic blood pressure` into the first position of a vector called Systolic.BP, but this vector doesn't exist yet.

If you are using the subsetting operator [, you need to have the vector already defined. For example, I get an error if I do:

for (i in 1:10) {
  x[i] <- i
}
#> Error: object 'x' not found

This is because x doesn't exist when I try to write to its first position. The correct way to write this loop is to create the vector first:

x <- numeric(10)
for (i in 1:10) {
  x[i] <- i
}
x
#> [1]  1  2  3  4  5  6  7  8  9 10
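
To make the difference between "doesn't exist" and "exists but is too short" concrete, here is a small sketch. The names not.yet.defined and grown are made up for illustration; the point is only that R needs the object to exist before you can assign into it with [, and that preallocating to the final length is the tidier habit.

```r
# Element assignment into a name that has never been created fails.
# try() captures the error so the rest of the script keeps running.
res <- try(not.yet.defined[1] <- 5, silent = TRUE)
failed <- inherits(res, "try-error")  # TRUE: the object was not found

# Once the vector exists, even with length zero, element assignment works.
# R grows the vector as needed and pads the skipped positions with NA.
grown <- numeric(0)
grown[3] <- 5

# Preallocating to the final length avoids growing the vector on every pass.
n <- 10
x <- numeric(n)
for (i in seq_len(n)) {
  x[i] <- i^2
}
```

seq_len(n) is used instead of 1:n here because 1:0 counts down (1, 0) rather than giving an empty sequence, which matters if the data ever has zero rows.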

It's not clear to me why you need to write each variable separately for passing to the function inside the loop anyway - you could just do:

outcome.pred.gwtg <- numeric(nrow(trainset))

for (i in 1:nrow(trainset)) {
  
  outcome.pred.gwtg[i] <- m.gwtg(trainset$`Systolic blood pressure`[i], 
                                 trainset$`Urea nitrogen`[i],
                                 trainset$`Blood sodium`[i],
                                 trainset$age[i],
                                 trainset$`heart rate`[i],
                                 trainset$COPD[i])
}

Another option, since you are only using the new variable names inside the loop, is to do:

outcome.pred.gwtg <- numeric(nrow(trainset))

for (i in 1:nrow(trainset)) {
  Systolic.BP <- trainset$`Systolic blood pressure`[i]
  BUN         <- trainset$`Urea nitrogen`[i]
  Sodium      <- trainset$`Blood sodium`[i]
  Age         <- trainset$age[i]
  HR          <- trainset$`heart rate`[i]
  COPD        <- trainset$COPD[i]
  
  outcome.pred.gwtg[i] <- m.gwtg(Systolic.BP, BUN, Sodium, Age, HR, COPD)
}

Also, note that there's no point in filling these vectors element by element in the first place. You can create them outside the loop:

Systolic.BP <- trainset$`Systolic blood pressure`
BUN         <- trainset$`Urea nitrogen`
Sodium      <- trainset$`Blood sodium`
Age         <- trainset$age
HR          <- trainset$`heart rate`
COPD        <- trainset$COPD

outcome.pred.gwtg <- numeric(nrow(trainset))

for (i in 1:nrow(trainset)) {
  outcome.pred.gwtg[i] <- m.gwtg(Systolic.BP[i], BUN[i], Sodium[i], Age[i], HR[i], COPD[i])
}
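
If m.gwtg() is built only from vectorised operations (an assumption, since its definition isn't shown in the question), the loop may not be needed at all: you can pass the whole columns in one call. The sketch below uses made-up toy data and a hypothetical stand-in for m.gwtg() that is plain arithmetic, not the real score, just to show that the loop and the single call agree in that case.

```r
# Toy stand-in for the real data (hypothetical values, column names from the question)
trainset <- data.frame(
  `Systolic blood pressure` = c(120, 95, 140),
  `Urea nitrogen`           = c(20, 45, 15),
  `Blood sodium`            = c(140, 132, 138),
  age                       = c(65, 80, 55),
  `heart rate`              = c(70, 110, 85),
  COPD                      = c(0, 1, 0),
  check.names = FALSE  # keep the names with spaces exactly as written
)

# Hypothetical stand-in for m.gwtg(): NOT the real score, only vectorised arithmetic
m.gwtg <- function(sbp, bun, sodium, age, hr, copd) {
  0.1 * age + 0.05 * bun + 0.02 * hr - 0.03 * sbp - 0.01 * sodium + 2 * copd
}

# Loop version: preallocate, then fill one row at a time
outcome.loop <- numeric(nrow(trainset))
for (i in seq_len(nrow(trainset))) {
  outcome.loop[i] <- m.gwtg(trainset$`Systolic blood pressure`[i],
                            trainset$`Urea nitrogen`[i],
                            trainset$`Blood sodium`[i],
                            trainset$age[i],
                            trainset$`heart rate`[i],
                            trainset$COPD[i])
}

# Vectorised version: one call on whole columns, no loop
outcome.vec <- m.gwtg(trainset$`Systolic blood pressure`,
                      trainset$`Urea nitrogen`,
                      trainset$`Blood sodium`,
                      trainset$age,
                      trainset$`heart rate`,
                      trainset$COPD)

same <- isTRUE(all.equal(outcome.loop, outcome.vec))
```

If your real m.gwtg() contains scalar-only constructs such as if (...) on its inputs, the single call won't behave the same way and you'd keep the loop.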

CodePudding user response:

The assignment Systolic.BP[i] <- ... fails because you apparently haven't created the vector Systolic.BP beforehand, so the loop stops before m.gwtg(...) is ever called. Anyhow: you're working with a data.frame ("trainset"), and there are a couple of more efficient ways to do this in R.

Example (using dplyr):

library(dplyr)

trainset %>%
  rename(
    Systolic.BP = `Systolic blood pressure`,
    ## other renaming instructions
    ## of the form new_name = old_name ...
    HR = `heart rate`
  ) %>%
  rowwise() %>%
  mutate( 
    outcome.pred.gwtg = m.gwtg(Systolic.BP,
                               ## other renamed predictors ...
                               COPD)
  )
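
If you'd rather stay in base R, mapply() is another way to call a function once per row without writing the loop yourself. This is only a sketch: the toy data are made up, and the m.gwtg() below is a hypothetical scalar-only stand-in (it uses if, so it can't take whole columns at once), not the real function from the question.

```r
# Toy stand-in for the real data (hypothetical values, column names from the question)
trainset <- data.frame(
  `Systolic blood pressure` = c(120, 95, 140),
  `Urea nitrogen`           = c(20, 45, 15),
  `Blood sodium`            = c(140, 132, 138),
  age                       = c(65, 80, 55),
  `heart rate`              = c(70, 110, 85),
  COPD                      = c(0, 1, 0),
  check.names = FALSE  # keep the names with spaces exactly as written
)

# Hypothetical scalar-only stand-in for m.gwtg(): `if` expects a single TRUE/FALSE
m.gwtg <- function(sbp, bun, sodium, age, hr, copd) {
  score <- 0.1 * age + 0.05 * bun
  if (copd == 1) score <- score + 2
  score
}

# mapply() walks the columns in parallel and calls m.gwtg() once per row
outcome.pred.gwtg <- mapply(
  m.gwtg,
  trainset$`Systolic blood pressure`,
  trainset$`Urea nitrogen`,
  trainset$`Blood sodium`,
  trainset$age,
  trainset$`heart rate`,
  trainset$COPD
)
```

The result is a plain numeric vector with one value per row, so you can attach it with trainset$outcome.pred.gwtg <- outcome.pred.gwtg if you want it next to the data.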