Loop over multiple data frames with mathematical function

Time:07-15

I have 5 data frames, split from one according to a variable, to which I want to apply the same function based on the same 3 columns from each data frame. Each contains 10,000 rows.

My data:

   Dist     X    Y  deg ofs      Z
1 20.21 499.3 3577 4.77   0 19.750
2 20.23 482.3 3578 4.77 -50 19.731
3 20.23 481.3 3578 4.77 -25 19.741
4 20.23 480.3 3578 4.77   0 19.750
5 20.23 479.3 3578 4.77  25 19.749
6 20.24 478.3 3578 4.77  50 19.740

Split like this:

splitdf <- split(df, df$ofs)
str(splitdf)
X1 <- splitdf$`-50`
X2 <- splitdf$'-25'
X3 <- splitdf$'0'
X4 <- splitdf$'25'
X5 <- splitdf$'50'
df.list <- list(X1,X2,X3,X4,X5)

I have created two trigonometric functions:

(X + distance * cos(angle)), (Y - distance * sin(angle))

NewX <- function(x){
  df.list[[i]][2] + df.list[[i]][5] * cos(df.list[[i]][4])
}
NewY <- function(x) {
  df.list[[i]][3] - df.list[[i]][5] * sin(df.list[[i]][4]) 
}

I then created a loop to apply these functions to each data frame, thus creating new columns.

for (i in 1:length(df.list)){
  df.list[[i]]$newcol1 <-  lapply(df.list[[i]]$X, FUN=NewX)
  df.list[[i]]$newcol2 <- lapply(df.list[[i]]$Y, FUN=NewY)
}    

Unfortunately, this yields neither results nor error messages, although the console stays busy for a few minutes.

I tried again on the original data, before splitting it into separate data frames, using:

NewX <- function(x){
  df[2] + df[5] * cos(df[4])
}
NewY <- function(x) {
  df[3] - df[5] * sin(df[4]) 
}

for (i in 1:length(df)){
  df$newX <-  lapply(df$X, FUN=NewX)
  df$newY <- lapply(df$Y, FUN=NewY)
}  

This approach is too slow: it produces no result even after one hour. In either case I get no error messages, so it is very difficult to tell what I am doing wrong.

Does anyone have any ideas? Thanks!

EDIT

I ran the loop over the single, unsplit data frame, changing the code to store the output in a new data frame.

for (i in 1:length(df)){
  lapply(df$X, FUN=NewX)
 lapply(df$Y, FUN=NewY) -> newdf
}    

A NewX column is created, and each cell contains a single-column data frame with 50,000 results. Removing the loop and running with a pipe yields: Error in FUN(X[[i]],...): Unused argument

Answer:

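You can do this with by(), which splits a data frame by a grouping variable and applies a function to each piece, so the manual split and the loop are not needed.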

fun <- function(x) cbind(x, newcol1=x[, 2] + x[, 5]*cos(x[, 4]), newcol2=x[, 3] - x[, 5]*sin(x[, 4]))

by(df, df$ofs, fun)
# df$ofs: -50
#    Dist     X    Y  deg ofs      Z newcol1  newcol2
# 2 20.23 482.3 3578 4.77 -50 19.731 479.421 3528.083
# --------------------------------------------------------------------------------------------- 
#   df$ofs: -25
#    Dist     X    Y  deg ofs      Z  newcol1  newcol2
# 3 20.23 481.3 3578 4.77 -25 19.741 479.8605 3553.041
# --------------------------------------------------------------------------------------------- 
#   df$ofs: 0
#    Dist     X    Y  deg ofs     Z newcol1 newcol2
# 1 20.21 499.3 3577 4.77   0 19.75   499.3    3577
# 4 20.23 480.3 3578 4.77   0 19.75   480.3    3578
# --------------------------------------------------------------------------------------------- 
#   df$ofs: 25
#    Dist     X    Y  deg ofs      Z  newcol1  newcol2
# 5 20.23 479.3 3578 4.77  25 19.749 480.7395 3602.959
# --------------------------------------------------------------------------------------------- 
#   df$ofs: 50
#    Dist     X    Y  deg ofs     Z newcol1  newcol2
# 6 20.24 478.3 3578 4.77  50 19.74 481.179 3627.917

If you plan to reassemble it:

do.call(rbind, by(df, df$ofs, fun))
#      Dist     X    Y  deg ofs      Z  newcol1  newcol2
# -50 20.23 482.3 3578 4.77 -50 19.731 479.4210 3528.083
# -25 20.23 481.3 3578 4.77 -25 19.741 479.8605 3553.041
# 0.1 20.21 499.3 3577 4.77   0 19.750 499.3000 3577.000
# 0.4 20.23 480.3 3578 4.77   0 19.750 480.3000 3578.000
# 25  20.23 479.3 3578 4.77  25 19.749 480.7395 3602.959
# 50  20.24 478.3 3578 4.77  50 19.740 481.1790 3627.917
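Since the two formulas only use values from the same row, splitting may not be necessary at all. A minimal sketch, assuming the column names from the question and that deg is already in radians (as the output above implies), computes both columns directly on the unsplit data frame; the column names newX and newY are taken from the question's second attempt.

```r
# Six-row sample copied from the question
df <- data.frame(
  Dist = c(20.21, 20.23, 20.23, 20.23, 20.23, 20.24),
  X    = c(499.3, 482.3, 481.3, 480.3, 479.3, 478.3),
  Y    = c(3577, 3578, 3578, 3578, 3578, 3578),
  deg  = 4.77,
  ofs  = c(0, -50, -25, 0, 25, 50),
  Z    = c(19.750, 19.731, 19.741, 19.750, 19.749, 19.740)
)

# Arithmetic, cos() and sin() are vectorised: each line processes every row at once
df$newX <- df$X + df$ofs * cos(df$deg)
df$newY <- df$Y - df$ofs * sin(df$deg)

# Split afterwards only if the per-offset data frames are still needed
splitdf <- split(df, df$ofs)

head(df)
```

This keeps the original row order and row names, and avoids both the loop and the reassembly step.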