How can I write a loop in R for the following problem?

Time:08-18

Year  Score 1  Score 2
2012  34       45
2012  41       46
2013  31       44
2013  44       33
2014  35       56
2014  42       21

I wrote this, but it only gives me the result for the final year. I am a newbie and could not find an example similar to my case. Can someone help me?

CodePudding user response:

If you want to do it with a for loop, you can.

Taking the data from @dcarlson's answer:

newdf <- structure(list(Year = c(2012L, 2012L, 2013L, 2013L, 2014L, 2014L),
Score1 = c(34L, 41L, 31L, 44L, 35L, 42L), 
Score2 = c(45L, 46L, 44L, 33L, 56L, 21L)), class = "data.frame", 
row.names = c(NA, -6L))

I am not familiar with bestNormalize, so I will just add the three values in the first row of each year. The important thing is that you need somewhere to store your values, and a list works well for that:

result <- list()

Now we can run a loop and append to that list whatever we have calculated:

for (i in 1:3){
  cat(i);cat(" - processing year ");cat(i + 2011);cat("\n") # FYI
  tmp = newdf[newdf$Year == i + 2011,]     # rows for the current year
  abc = sum(tmp[1,1], tmp[1,2], tmp[1,3])  # replace by your function
  result <- append(result, abc)  # accumulating results in a list
}

print(result)
str(result)

Because I just added three numbers, the result per year is a single number, so in my case the result is a list of three sums.

You may want to throw in a call to names so that you'll remember which year produced which list entry:

result <- list()
for (i in 1:3){
  tmp = newdf[newdf$Year == i + 2011,]     # rows for the current year
  abc = sum(tmp[1,1], tmp[1,2], tmp[1,3])  # replace by your function
  names(abc) <- 2011 + i                   # label the value with its year
  result <- append(result, abc)  # accumulating results with names in a list
}

print(result)
str(result)
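
A variation on the loop above, as a sketch only: instead of hard-coding 1:3 and the offset 2011, it iterates over the years actually present in the data and stores each result under the year's name. The sum of the first row is still just a placeholder for your own function, and the data is repeated so the snippet runs on its own.

```r
newdf <- data.frame(
  Year   = c(2012L, 2012L, 2013L, 2013L, 2014L, 2014L),
  Score1 = c(34L, 41L, 31L, 44L, 35L, 42L),
  Score2 = c(45L, 46L, 44L, 33L, 56L, 21L)
)

years  <- unique(newdf$Year)
result <- vector("list", length(years))  # pre-allocate one slot per year
names(result) <- years                   # names become "2012", "2013", "2014"

for (yr in years) {
  tmp <- newdf[newdf$Year == yr, ]       # rows for the current year
  # placeholder calculation: sum of the first row (Year + Score1 + Score2)
  result[[as.character(yr)]] <- sum(tmp[1, ])
}

str(result)
```

Indexing by name means a result can be looked up later with result[["2013"]], and the loop keeps working if more years are added to the data.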

CodePudding user response:

If you want to use a loop, you will need to define abc as a matrix or data.frame and index it to store each set of results. It would be simpler to just use lapply and sapply. I can't test this with bestNormalize because it does not work with the sample sizes in your example, so scale is used below as a stand-in. First, provide reproducible data rather than a table, using dput(newdf):

newdf <- structure(list(Year = c(2012L, 2012L, 2013L, 2013L, 2014L, 2014L
), Score1 = c(34L, 41L, 31L, 44L, 35L, 42L), Score2 = c(45L, 
46L, 44L, 33L, 56L, 21L)), class = "data.frame", row.names = c(NA, 
-6L))

Then split into years:

df.splt <- split(newdf, newdf$Year)
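
To see what split produces, a quick check can help. This is a sketch that repeats the data so it runs on its own; the result is a named list with one data frame per year.

```r
newdf <- data.frame(
  Year   = c(2012L, 2012L, 2013L, 2013L, 2014L, 2014L),
  Score1 = c(34L, 41L, 31L, 44L, 35L, 42L),
  Score2 = c(45L, 46L, 44L, 33L, 56L, 21L)
)

df.splt <- split(newdf, newdf$Year)

names(df.splt)      # one list element per year
df.splt[["2013"]]   # only the rows for 2013
```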

Then use lapply:

df.lst <- lapply(df.splt, function(x) sapply(x[, -1], scale, center=FALSE, scale=TRUE))
df.lst
# $`2012`
#         Score1    Score2
# [1,] 0.6383359 0.6992942
# [2,] 0.7697580 0.7148340
# 
# $`2013`
#         Score1 Score2
# [1,] 0.5759535    0.8
# [2,] 0.8174824    0.6
# 
# $`2014`
#         Score1    Score2
# [1,] 0.6401844 0.9363292
# [2,] 0.7682213 0.3511234

The object df.lst is a list containing one matrix of results per year.
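
For comparison, here is a sketch of the same computation written as a for loop. The name df.lst2 is made up for this illustration, and scale is again only a stand-in for bestNormalize. Each year's result goes into its own list slot; assigning to a single variable inside the loop would overwrite it on every pass, which is a common reason for ending up with only the final year.

```r
newdf <- data.frame(
  Year   = c(2012L, 2012L, 2013L, 2013L, 2014L, 2014L),
  Score1 = c(34L, 41L, 31L, 44L, 35L, 42L),
  Score2 = c(45L, 46L, 44L, 33L, 56L, 21L)
)

df.splt <- split(newdf, newdf$Year)

# df.lst2 is a hypothetical name; one pre-allocated slot per year
df.lst2 <- vector("list", length(df.splt))
names(df.lst2) <- names(df.splt)

for (yr in names(df.splt)) {
  x <- df.splt[[yr]]
  # drop the Year column, then scale each score column without centering
  df.lst2[[yr]] <- sapply(x[, -1], scale, center = FALSE, scale = TRUE)
}

df.lst2[["2013"]]
```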
