Make a pipeable function

Time:10-05

I'm stuck on a small issue with writing a custom function in R that works with magrittr pipes. I've been trying to learn how to write functions that behave correctly when data is piped into them. My function summarizes the original dataset correctly, but it ignores modifications made by an earlier step in the pipeline. Example below:

library(dplyr) # provides %>%, filter() and summarise()

TestData <- runif(1000, 1, 100)
TestID <- 1:1000
data01 <- data.frame(TestID, TestData) # Generate data to test the command on

custom_summary_cont2 <- function(DAT, var) {
  
  DAT %>%
   summarise(
     mean = mean(var),
     median = median(var),
     sd = sd(var),
     quant25 = unname(quantile(var, probs = 0.25)),
     quant75 = unname(quantile(var, probs = 0.75)),
     min = min(var),
     max = max(var)
    )
} # The custom function

Now running this code either as:

summary <- custom_summary_cont2(data01, TestData)

or

summary <- data01 %>%
  custom_summary_cont2(TestData)

Both produce the results I'm interested in. The complication occurs when I call the custom function after another function has already been applied. For example:

summary <- data01 %>%
  filter(TestData > 50) %>%
  custom_summary_cont2(TestData)

This returns the same result as if the filter step were not there. How should I edit the function so that it uses the output of the filter command?

P.S. If this is a really stupid question, I'd love a recommendation for a good book that covers these topics.

CodePudding user response:

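This is not related to piping.

Inside summarise(), var is evaluated as an ordinary function argument, so it resolves to the TestData vector in your global environment, not to the TestData column of the data frame that was piped in. That is why the filter step appears to have no effect.

To refer to the column, evaluation of the argument has to be deferred so that dplyr looks it up in the data (non-standard evaluation).

You can do this by wrapping the argument in {{ }}, i.e. {{ var }}: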

custom_summary_cont2 <- function(DAT, var) {
  DAT %>%
    summarise(
      mean = mean({{var}}),
      median = median({{var}}),
      sd = sd({{var}}),
      quant25 = unname(quantile({{var}}, probs = 0.25)),
      quant75 = unname(quantile({{var}}, probs = 0.75)),
      min = min({{var}}),
      max = max({{var}})
    )
}
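
To see the difference directly, here is a minimal sketch that counts how many values each version actually summarises. The helper names count_plain and count_embraced are made up for illustration, and the setup assumes a global TestData vector exists, as in the question.

```r
library(dplyr)

# Same setup as the question: a global vector AND a column with the same name
TestData <- runif(1000, 1, 100)
data01 <- data.frame(TestID = 1:1000, TestData)

# Hypothetical helper: `var` is evaluated as a normal argument,
# so it finds the global TestData vector
count_plain <- function(DAT, var) {
  DAT %>% summarise(n = length(var), min = min(var))
}

# Hypothetical helper: {{ var }} is looked up inside DAT
count_embraced <- function(DAT, var) {
  DAT %>% summarise(n = length({{ var }}), min = min({{ var }}))
}

filtered <- data01 %>% filter(TestData > 50)

plain    <- count_plain(filtered, TestData)    # n is 1000, the full global vector
embraced <- count_embraced(filtered, TestData) # n equals nrow(filtered)
```

If the global TestData vector did not exist, the un-embraced version would fail with an "object not found" error instead of silently returning the unfiltered summary, which is probably the easier failure to notice.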

CodePudding user response:

Using {{var}} instead of var gives the expected result:

custom_summary_cont2 <- function(DAT, var) {
  
  DAT %>%
    summarise(
      mean = mean({{var}}),
      median = median({{var}}),
      sd = sd({{var}}),
      quant25 = unname(quantile({{var}}, probs = 0.25)),
      quant75 = unname(quantile({{var}}, probs = 0.75)),
      min = min({{var}}),
      max = max({{var}})
    )
}

data01 %>%
  filter(TestData >50) %>%
  custom_summary_cont2(TestData)

      mean   median       sd  quant25  quant75      min      max
1 74.95416 74.38507 14.66733 62.43108 87.26725 50.08382 99.92935

The same summary written out directly, without the custom function, gives identical output:

data01 %>%
  filter(TestData >50) %>%
  summarise(
    mean = mean(TestData),
    median = median(TestData),
    sd = sd(TestData),
    quant25 = unname(quantile(TestData, probs = 0.25)),
    quant75 = unname(quantile(TestData, probs = 0.75)),
    min = min(TestData),
    max = max(TestData)
  )
      mean   median       sd  quant25  quant75      min      max
1 74.95416 74.38507 14.66733 62.43108 87.26725 50.08382 99.92935
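
A related case, not covered above: if the column name reaches the function as a character string (say, from a loop over column names), embracing does not apply. dplyr's .data pronoun can be used instead. A rough sketch, assuming the same data01 layout as the question; custom_summary_chr is a hypothetical name and only a few of the statistics are shown:

```r
library(dplyr)

data01 <- data.frame(TestID = 1:1000, TestData = runif(1000, 1, 100))

# Hypothetical variant: `var` is a string naming the column to summarise
custom_summary_chr <- function(DAT, var) {
  DAT %>%
    summarise(
      mean = mean(.data[[var]]),
      min  = min(.data[[var]]),
      max  = max(.data[[var]])
    )
}

res <- data01 %>%
  filter(TestData > 50) %>%
  custom_summary_chr("TestData")
```

Because .data[[var]] always refers to the data frame currently being summarised, this version also respects the preceding filter step.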