In R, what do parentheses followed by parentheses mean?

The syntax for using scales::label_percent() in a mutate function is unusual because it uses double parentheses:

label_percent()(an_equation_goes_here)

I don't think I have seen ()() syntax in R before and I don't know how to look it up because I don't know what it is called. I tried ?`()()` and ??`()()` and neither helped. What is double parentheses syntax called? Can someone recommend a place to read about it?

Here is an example for context:

library(tidyverse)
members <- 
  read_csv(
    paste0(
      "https://raw.githubusercontent.com/rfordatascience/tidytuesday/", 
      "master/data/2020/2020-09-22/members.csv"
    ), 
    show_col_types = FALSE)


members %>%
  count(success, died) %>%
  group_by(success) %>%
  # old syntax:
  # mutate(percent = scales::percent(n / sum(n))) 
  # new syntax:
  mutate(percent = scales::label_percent()(n / sum(n))) 
#> # A tibble: 4 × 4
#> # Groups:   success [2]
#>   success died      n percent
#>   <lgl>   <lgl> <int> <chr>  
#> 1 FALSE   FALSE 46452 98%    
#> 2 FALSE   TRUE    868 2%     
#> 3 TRUE    FALSE 28961 99%    
#> 4 TRUE    TRUE    238 1%

Created on 2023-01-01 with reprex v2.0.2

CodePudding user response：

Most functions return a value, whether something atomic (numeric, integer, character), list-like (including data.frame), or something more complex. For those, the single set of ()s (as you recognize) is for the one call.

Occasionally, however, a function call returns a function. For example, if we look at ?scales::label_percent, we can scroll down to

Value:

     All 'label_()' functions return a "labelling" function, i.e. a
     function that takes a vector 'x' and returns a character vector of
     'length(x)' giving a label for each input value.

Let's look at it step-by-step:

fun <- scales::label_percent()
fun
# function (x) 
# {
#     number(x, accuracy = accuracy, scale = scale, prefix = prefix, 
#         suffix = suffix, big.mark = big.mark, decimal.mark = decimal.mark, 
#         style_positive = style_positive, style_negative = style_negative, 
#         scale_cut = scale_cut, trim = trim, ...)
# }
# <bytecode: 0x00000168ee5440e8>
# <environment: 0x00000168ee5501b8>
fun(0.35)
# [1] "35%"

The first call to scales::label_percent() returned a function. We can then call that function as often as we want, on whatever input we want.

If you don't want to store the returned function in a variable like fun, you can use it immediately by following the first set of ()s with another set of parens.

scales::label_percent()(0.35)
# [1] "35%"
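To tie this back to the mutate() call in the question, here is a small sketch that applies the returned labeller to a whole vector instead of a single number. The two counts are copied from the question's output for success == FALSE; the variable names n, labeller, two_step, and one_step are only illustrative.

```r
library(scales)

# The two counts for success == FALSE, taken from the question's output.
n <- c(46452, 868)

# Step 1: build the labelling function. Step 2: call it on a vector.
labeller <- label_percent()
two_step <- labeller(n / sum(n))

# Both steps at once, exactly as written inside mutate().
one_step <- label_percent()(n / sum(n))

# The two spellings give the same character vector.
identical(two_step, one_step)
```

Since the labeller is vectorised, it returns one label per input value, which is why it works on a whole column inside mutate().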

A related question is "why would you want a function to return another function?" There are many stylistic reasons, but the scales::label_* functions are designed for places where an option needs to be expressed as a function, not as a static value. ggplot code is one example: axis ticks are usually placed with simple heuristics that determine the count, locations, and rendering of the tick marks. While one can pass fixed vectors to the breaks = and labels = arguments of a scale such as ggplot2::scale_x_continuous() to manually control how many ticks there are, where they go, and what they look like, it is often more convenient not to care a priori how many or where. When faceting is used, the ticks can also vary per facet, so they are not something one can easily assign in a static variable. In those cases, it is often better to supply a function that is given some simple inputs (such as the min/max of the axis) and returns something meaningful.
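As a rough illustration of that ggplot use, the sketch below hands the labeller to a scale. The data frame shares is made up for this example from the built-in mtcars data; it is not part of the question.

```r
library(ggplot2)

# Illustrative data: the share of cars per cylinder count in mtcars.
shares <- as.data.frame(prop.table(table(cyl = mtcars$cyl)))

p <- ggplot(shares, aes(x = cyl, y = Freq)) +
  geom_col() +
  # Only ONE pair of parentheses here: we pass the function that
  # label_percent() returns, and ggplot2 calls it later on whatever
  # breaks it decides to draw.
  scale_y_continuous(labels = scales::label_percent())

p
```

Here the second pair of parentheses never appears in our own code, because ggplot2 makes that second call for us.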

Why can't we just pass it scales::label_percent? (Good question.) Even though you're using the default values in your call here, one might want to change any or all of the controllable things, such as:

suffix= defaults to "%", but perhaps you want a space as in " %"?
decimal.mark= defaults to ".", but maybe your locale prefers commas?

While it is feasible to have multiple functions for all of the combinations of these options, it is generally easier in the long run to provide a "template function" for creating the function, such as

fun <- scales::label_percent(accuracy = 0.01, suffix = " %", decimal.mark = ",")
fun(0.353)
# [1] "35,30 %"
scales::label_percent(accuracy = 0.01, suffix = " %", decimal.mark = ",")(0.353)
# [1] "35,30 %"
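To make the "template function" idea concrete, here is a minimal base-R sketch of such a function-creating function. make_percent_labeller is a hypothetical helper written for this answer; it is not how scales implements label_percent(), and it only handles a few options.

```r
# Hypothetical helper, NOT part of scales: a stripped-down function factory.
make_percent_labeller <- function(suffix = "%", decimal.mark = ".", digits = 0) {
  # Evaluate the options now, so the returned function keeps these exact values.
  force(suffix); force(decimal.mark); force(digits)
  # The returned function remembers the options above (it is a closure).
  function(x) {
    paste0(
      formatC(x * 100, format = "f", digits = digits, decimal.mark = decimal.mark),
      suffix
    )
  }
}

plain  <- make_percent_labeller()
french <- make_percent_labeller(suffix = " %", decimal.mark = ",", digits = 2)

plain(0.35)
# [1] "35%"
french(0.353)
# [1] "35,30 %"

# Or build and call in one go, with the same ()() pattern:
make_percent_labeller(digits = 1)(c(0.5, 0.125))
# [1] "50.0%" "12.5%"
```

The first pair of parentheses configures the labeller and the second pair uses it, which is the same split that scales::label_percent() makes.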

CodePudding user response：

An Expression followed by an argument list in round parentheses (( / )) is called a Function Call in R.

There's no need to have a special name for two function calls in a row. They're still just function calls.
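A few base-R examples may help show that the thing in front of the parentheses can be any expression that evaluates to a function, not only a name. The names a, b, d, and tools are made up for this sketch.

```r
# 1. An anonymous function, called on the spot.
a <- (function(x) x + 1)(2)     # a is 3

# 2. A function pulled out of a list, then called.
tools <- list(square = function(x) x^2)
b <- tools$square(4)            # b is 16

# 3. A function returned by another function (base R's match.fun), then called.
d <- match.fun("sum")(1:3)      # d is 6

c(a, b, d)
```

In each case R first evaluates the expression on the left to get a function and then calls it with the arguments in the parentheses, which is all that label_percent()(x) does.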

CodePudding user response：

If we run a function and the value it returns is itself a function, then we can call that one too.

For example, we first run f using f() and assign the return value to g. That return value is itself a function (namely function() 3), so g is a function and we can run it too.

# f is a function which returns a function
f <- function() function() 3  

g <- f()  # this runs f which returns `function() 3`
g()  # thus g is a function so we can call it
## [1] 3

Now putting that all together we can write it in one line as

f()()
## [1] 3

As seen, there is only one meaning for (). Two pairs appear together simply because we are calling the result of a call.
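One way to check this for yourself is to look at how R parses the expression. The sketch below uses quote(), so f is never run and does not even need to exist; the name e is only illustrative.

```r
e <- quote(f()())   # capture the expression without evaluating it

is.call(e)          # TRUE: the whole thing is one call ...
e[[1]]              # ... whose "function" slot is itself the call f()
is.call(e[[1]])     # TRUE: so it is a call sitting inside a call
e[[1]][[1]]         # the inner call's function slot is the plain name f
```

So the parser sees nothing special here, just an ordinary call whose function part happens to be another ordinary call.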

CodePudding user response：

From ?scales::label_percent:

All label_() functions return a "labelling" function, i.e. a function that takes a vector x and returns a character vector of length(x) giving a label for each input value.

So in simple terms, the second pair of parentheses () calls the returned labelling function.

A simple example to show this:

library(scales)

french_percent <- label_percent(
  decimal.mark = ",",
  suffix = " %"
)

french_percent(0.01)
#> [1] "1 %"

Created on 2023-01-01 with reprex v2.0.2