R function with some default argument dependent on other argument

Time:09-23

I have a function in which the default value of one argument depends on another argument, and I am seeing some strange behaviour. In the example below, the argument is colheads=colnames(data), which depends on data.

aaa <- head(iris)
fun <- function(data, colheads=colnames(data), rownames=1:nrow(data), rownames.label="Rowlabel"){
       data <- cbind(rownames, data)
       colnames(data) <- c(rownames.label, colheads)
}
fun(aaa)

Here I get an error on the last line of the function, colnames(data) <- c(rownames.label, colheads). It looks like the colheads argument is updated because data itself was updated on the line before.

I think so because if I run the same code outside a function, there is no error.

data <- aaa
colheads <- colnames(data)
rownames <- 1:nrow(data)
rownames.label <- "Rowlabel"
data <- cbind(rownames, data)
colnames(data) <- c(rownames.label, colheads)

Then I tried adding a print call inside the function to check where it happens (the debug function also points to the last line). With this, I still get the error. Again, it looks like colheads is updated.

aaa <- head(iris)
fun <- function(data, colheads=colnames(data), rownames=1:nrow(data), rownames.label="Rowlabel"){
       data <- cbind(rownames, data)
       print(colheads)
       colnames(data) <- c(rownames.label, colheads)
}
fun(aaa)

But if I also add a print before data is updated, the error disappears.

aaa <- head(iris)
fun <- function(data, colheads=colnames(data), rownames=1:nrow(data), rownames.label="Rowlabel"){
       print(colheads) 
       data <- cbind(rownames, data)
       print(colheads)
       colnames(data) <- c(rownames.label, colheads)
}
fun(aaa)

I found a workaround using a temporary variable colheads.temp below.

aaa <- head(iris)
fun <- function(data, colheads=colnames(data), rownames=1:nrow(data), rownames.label="Rowlabel"){
       colheads.temp <- colheads       
       data <- cbind(rownames, data)
       colnames(data) <- c(rownames.label, colheads.temp)
}
fun(aaa)

But still, as I am unsure about how R functions work, I am puzzled. Does anyone know what is going on and how R functions actually work?

CodePudding user response:

Yes, this is called lazy evaluation. The default argument colheads=colnames(data) is not evaluated until colheads is first used inside the function, and it uses the value of data at that moment. Once evaluated, the value is kept, which is why an early print(colheads) makes the error disappear. This is nice, because if colheads is never used, then it is never evaluated, which can make code faster (it has other benefits too, but also drawbacks).
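As a minimal sketch of the behaviour described above (lazy_default is a hypothetical function made up for illustration, not part of the question), a default that refers to another argument picks up whatever value that argument has when the default is first used:

```r
# Hypothetical example: the default for y refers to x.
lazy_default <- function(x, y = x * 2) {
  x <- 100   # reassign x before y has been used
  y          # the default x * 2 is evaluated only here, so it sees x = 100
}

result_default  <- lazy_default(1)         # default computed from the reassigned x
result_supplied <- lazy_default(1, y = 5)  # a supplied value is not affected by x
```

With the default, the result reflects the reassigned x rather than the value passed by the caller, which mirrors what happens to colheads after data is overwritten.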

The force function formalizes your workaround: calling force(colheads) as one of the first lines of your function forces the evaluation and locks in the value of colheads based on data as it is at that point.
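A sketch of the function from the question with force() applied might look like this. The final line returning data is an addition for illustration, so that the result can be inspected; the original function does not return the data frame explicitly.

```r
aaa <- head(iris)

fun <- function(data, colheads = colnames(data), rownames = 1:nrow(data),
                rownames.label = "Rowlabel") {
  force(colheads)                # evaluate the default now, while data is unchanged
  data <- cbind(rownames, data)  # rownames is evaluated here, before data is replaced
  colnames(data) <- c(rownames.label, colheads)
  data                           # added for illustration: return the result
}

res <- fun(aaa)
colnames(res)
```

Here colheads holds the five original column names of aaa, so the assignment on the last-but-one line supplies six names for six columns and no error is raised.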

If you'd like to learn more, I'd suggest reading the Functions chapter of Advanced R, or at least the section on lazy evaluation.
