Defining Function Arguments with "()" vs. " [ ]"

Time:01-03

I have seen functions where the input arguments are defined using () and other functions where the input arguments are defined using [ ].

For instance, here I can define a function using ():

my_function <- function(x1, x2, x3) {
    
    answer = x1 + x2 + x3
    return(answer)
    
}

If I want to call this function, then I can do so as follows:

my_function(5,1,2)
[1] 8

I can also define the same function using [ ]. For example:

f_1 <- function(x) {

  answer = x[[1]] + x[[2]] + x[[3]]
 
  return(answer)

}

# does not work

f_1(5,1,2)

Error in f_1(5, 1, 2) : unused arguments (1, 2)

# works:

f_1(c(5,1,2))

[1] 8

Does anyone know why there are two different ways of defining functions in R? Is one of them more advantageous for certain tasks? Are there any differences?

Thanks!

CodePudding user response:

There's already an answer to your question, but here I'll offer one from a different point of view.

In R, objects can contain more than one value. You can create a vector using

x <- c(1, 4, 9)

and it contains the 3 values. You can access those values using syntax like x[3] (which would give you the value 9). You can also use x[[3]]; for simple vectors like x, you'll get the same result.
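As a small sketch of the point above (the variable names single, double, and first_two are only illustrative), both operators return the same value for a plain vector, but only [ can select more than one element at a time:

```r
x <- c(1, 4, 9)

single <- x[3]     # single bracket
double <- x[[3]]   # double bracket

# For a simple unnamed vector the two results are identical.
same <- identical(single, double)

# [ accepts several indices; [[ is meant for exactly one element.
first_two <- x[1:2]
multi_fails <- inherits(try(x[[1:2]], silent = TRUE), "try-error")

print(single)
print(same)
print(first_two)
print(multi_fails)
```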

Objects can also be more complicated. A common way to create these complicated objects is using list, for example

mylist <- list(x = c(1, 4, 9), y = c(4, 5, 6))

Now mylist[[2]] will give you the vector c(4, 5, 6). (You can also use mylist[2] for a slightly different result, but a good practice as a beginner is to use [[ for lists, and [ for vectors.)

R also contains functions. There are built-in functions like c() and list() and mean(), and user-defined functions like your my_function(). When I write them with the parens after, it's just a convention to indicate that I'm talking about a function; the actual names are c, list, mean, and my_function.

When you actually use the parens in code, they indicate that you want to call the function, i.e. associate values with each of its arguments, and evaluate its body to carry out some computation and return some value.
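To illustrate the difference between naming a function and calling it, here is a minimal sketch (f and result are hypothetical names chosen for this example):

```r
x <- c(1, 4, 9)

# Without parens, mean is just the function object; it can be assigned.
f <- mean
is_fun <- is.function(f)

# With parens, the function is called: x is matched to its first
# argument and the body is evaluated.
result <- f(x)

print(is_fun)
print(result)
```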

Finally, to answer your question, which you start by saying "I have seen functions where the input arguments are defined using () and other functions where the input arguments are defined using [ ]." I'd describe what you're seeing differently.

When the function call looks like my_function(5,1,2), you're saying you have three arguments to your function, and you want to associate the three values 5, 1, and 2 with them.

When the function call looks like my_function(c(5,1,2)) and you refer to the values using x[[1]], x[[2]], etc., you're saying you have one argument to your function, but it's a complex object containing more than one value.
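A short sketch of the two styles side by side may help. The function names sum_three_args and sum_one_vector are made up for this illustration, and do.call() is shown as one possible way to pass a stored vector to the three-argument version:

```r
# Three separate arguments, each bound to one value.
sum_three_args <- function(x1, x2, x3) {
  x1 + x2 + x3
}

# One argument that is expected to hold at least three values.
sum_one_vector <- function(x) {
  x[[1]] + x[[2]] + x[[3]]
}

values <- c(5, 1, 2)

a <- sum_three_args(5, 1, 2)
b <- sum_one_vector(values)

# do.call() spreads the elements of a list over the arguments.
d <- do.call(sum_three_args, as.list(values))

print(c(a, b, d))
```

All three calls compute the same sum; the difference is only in how the values reach the function.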

When is one better than the other? That depends entirely on the context. If those 3 values are always part of a single conceptual object, then making them part of a single R object makes sense. An example would be working with points in 3D: they have x, y, and z coordinates, that are all part of describing the point.
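For the 3D point case, a minimal sketch could look like the following. The function distance_from_origin and the named vector p are assumptions made up for this example, not something from the question:

```r
# Treat a point as one object: a numeric vector of length 3.
distance_from_origin <- function(point) {
  stopifnot(is.numeric(point), length(point) == 3)
  sqrt(sum(point^2))
}

p <- c(x = 3, y = 4, z = 12)

d <- distance_from_origin(p)
z_coord <- p[["z"]]   # a single coordinate can still be pulled out by name

print(d)
print(z_coord)
```

Passing the point as one argument keeps the three coordinates together, so the caller cannot accidentally mix coordinates from different points.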

If they just happen to be 3 things that are unrelated but are needed for the computation, then it won't make sense to put them into a single object, and in fact, putting them in a list just to call the function that way would be inefficient.

I hope this gives you some more insight into R.

CodePudding user response:

Parentheses are used to list a function's arguments when you define it and to call it, as you show above. Used on its own around an expression, ( gives the "result of evaluating the argument." The square bracket, on the other hand, is essentially an index, or the "operator acting on vectors, matrices, arrays and lists to extract or replace parts." Let's look at some examples:

x1 <- c(1, 2, 3, 4)
x2 <- c(2, 4, 6, 8)
x3 <- c(3, 6, 9, 12)

So, if we run your first function, it adds the three vectors element by element, i.e., the values at each index position are summed.

my_function <- function(x1, x2, x3) {

answer = x1 + x2 + x3
return(answer)

}

my_function(x1, x2, x3)

# [1]  6 12 18 24

But with the next function, you have to give it a single object such as a vector or list. So, we could pass x1, and it will add the first three elements of that vector.

f_1 <- function(x) {

  answer = x[[1]] + x[[2]] + x[[3]]
 
  return(answer)

}

f_1(x1)

# [1] 6

In other words, you are referring to specific values in the vector. So, if we use f_1(x1), then in the function:

x[[1]] = 1
x[[2]] = 2
x[[3]] = 3

So, the reason why f_1(c(5,1,2)) works is because you have given a vector with length of 3, which fills all 3 positions in your function. But if you run f_1(c(5,1)), then you would get an error message, because you would not have a vector long enough for the function.

Error in x[[3]] : subscript out of bounds
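If you want a clearer message than "subscript out of bounds", one option is to check the length first. This is only a sketch; f_1_checked is a hypothetical variant of f_1 and the error text is made up:

```r
f_1_checked <- function(x) {
  # Fail early with a descriptive message when the input is too short.
  if (length(x) < 3) {
    stop("x must have at least 3 elements, got ", length(x))
  }
  x[[1]] + x[[2]] + x[[3]]
}

ok <- f_1_checked(c(5, 1, 2))

# Capture the error message instead of stopping the script.
msg <- tryCatch(f_1_checked(c(5, 1)),
                error = function(e) conditionMessage(e))

print(ok)
print(msg)
```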

If I wanted to add the 1st element of one vector, the 2nd element of another vector, and the 3rd element of a third vector, then we could do the following:

f_2 <- function(x1, x2, x3) {

  answer = x1[[1]] + x2[[2]] + x3[[3]]
 
  return(answer)

}

f_2(x1, x2, x3)

# [1] 14

So, in this example, we are pulling 3 numbers from the 3 different vectors (i.e., x1, x2, x3).

x1[[1]] = 1
x2[[2]] = 4
x3[[3]] = 9

Now, let's look at single and double brackets.

r <- list(c(1:8), foo = c(5:8), bar = c(4:7))

r

# [[1]]
# [1] 1 2 3 4 5 6 7 8

# $foo
# [1] 5 6 7 8

# $bar
# [1] 4 5 6 7

Now, the use of a single bracket will return a list. So, if we give index 1, then we return a list, which includes a vector of 8 elements.

r[1]

# [[1]]
# [1] 1 2 3 4 5 6 7 8

So, if we wanted to return the second and third elements in the list, then we could give index 2 and 3.

r[c(2,3)]

# $foo
# [1] 5 6 7 8

# $bar
# [1] 4 5 6 7

Another option is to give the name of the list element in brackets:

r["foo"]

# $foo
# [1] 5 6 7 8

For lists, [[ selects a single element (which could itself be another list). On the other hand, [ returns a list of the selected elements. Essentially, the [[ operator extracts an element from a list, whereas [ returns a subset of the list that is still a list.

> r[1]
[[1]]
[1] 1 2 3 4 5 6 7 8

> class(r[1])
[1] "list"

So above, the single bracket returns a list; however, if we use a double bracket, then it returns the element itself, here an integer vector.

> r[[1]]
[1] 1 2 3 4 5 6 7 8

> class(r[[1]])
[1] "integer"

Further, you can use multiple brackets to access elements of nested lists.

r_list <- list(list(c(1:8), foo = c(5:8), bar = c(4:7)), 
          list(c(11:19), foo = c(1:10), far = c(9:14)))

So, if we wanted to access the 14 in far in the second list, then we could do the following:

> r_list[[2]][[3]][6]
[1] 14

# Or remember you can use the name for named elements.
> r_list[[2]][["far"]][6]
[1] 14
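As a side note, [[ also accepts a vector of indices on a list and applies them one level at a time. The sketch below reuses r_list from above (redefined so the block runs on its own); by_steps and by_path are illustrative names:

```r
r_list <- list(list(c(1:8), foo = c(5:8), bar = c(4:7)),
               list(c(11:19), foo = c(1:10), far = c(9:14)))

# Step by step: second list, then its third element, then position 6.
by_steps <- r_list[[2]][[3]][6]

# Recursive indexing: [[c(2, 3)]] means [[2]] followed by [[3]].
far_vec <- r_list[[c(2, 3)]]
by_path <- far_vec[6]

print(far_vec)
print(identical(by_steps, by_path))
```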

You can also use brackets along with $ operator.

df <- structure(list(x1 = c(1, 2, 3, 4), x2 = c(2, 4, 6, 8), x3 = c(3, 
6, 9, 12)), class = "data.frame", row.names = c(NA, -4L))

> df
  x1 x2 x3
1  1  2  3
2  2  4  6
3  3  6  9
4  4  8 12

So, if we wanted to access the 12, then we can first use $ followed by [. We first select the column using $, which returns only the values from x3 as a vector. Then, we access the fourth element in that vector.

df$x3[4]

[1] 12

# This would be the same as using brackets to get the third column, then get the fourth element.

df[,3][4]
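
There are a few more equivalent spellings. The sketch below rebuilds the same df with data.frame() so it runs on its own; the variable names on the left are only for illustration:

```r
df <- data.frame(x1 = c(1, 2, 3, 4),
                 x2 = c(2, 4, 6, 8),
                 x3 = c(3, 6, 9, 12))

via_dollar   <- df$x3[4]        # column by name, then element
via_column   <- df[, 3][4]      # column by position, then element
via_row_col  <- df[4, 3]        # row and column in one step
via_double   <- df[["x3"]][4]   # [[ extracts the column as a vector
via_name     <- df[4, "x3"]     # row number and column name

results <- c(via_dollar, via_column, via_row_col, via_double, via_name)
print(results)
```

All five expressions return the same value; which one reads best depends on whether you know the column by name or by position.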