I'm stuck on a small issue with writing a custom function in R that works with magrittr pipes. I've been trying to learn how to write functions that work when data is passed to them through a pipe. The function summarizes the original dataset correctly, but it ignores modifications made by a previous step in the pipeline. Example below:
library(dplyr) # provides %>%, filter() and summarise()

TestData <- runif(1000, 1, 100)
TestID <- 1:1000
data01 <- data.frame(TestID, TestData) # Generate data to test the command on

custom_summary_cont2 <- function(DAT, var) {
  DAT %>%
    summarise(
      mean = mean(var),
      median = median(var),
      sd = sd(var),
      quant25 = unname(quantile(var, probs = 0.25)),
      quant75 = unname(quantile(var, probs = 0.75)),
      min = min(var),
      max = max(var)
    )
} # The custom function
Now running this code either as:
summary <- custom_summary_cont2(data01, TestData)
or
summary <- data01 %>%
  custom_summary_cont2(TestData)
Both produce the results I'm interested in. The complication occurs when I call the custom function after applying another function first. For example:
summary <- data01 %>%
  filter(TestData > 50) %>%
  custom_summary_cont2(TestData)
This code returns the same result as if the filter call were not there. How should I edit the function so that it uses the result of the filter command?
P.S. If this is a really stupid question, I'd love a recommendation for a good book that covers these processes.
CodePudding user response:
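This is not related to piping.
Inside your function, var is evaluated as the TestData vector from your global environment, not as the TestData column of DAT, so the filtered data is never used.
To refer to the column, the argument has to be passed through to summarise() unevaluated.
You can do this by wrapping the column reference in {{ var }} (the "embrace" operator), i.e.: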
custom_summary_cont2 <- function(DAT, var) {
  DAT %>%
    summarise(
      mean = mean({{ var }}),
      median = median({{ var }}),
      sd = sd({{ var }}),
      quant25 = unname(quantile({{ var }}, probs = 0.25)),
      quant75 = unname(quantile({{ var }}, probs = 0.75)),
      min = min({{ var }}),
      max = max({{ var }})
    )
}
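To see the difference for yourself, here is a minimal sketch that puts the two behaviours side by side. It assumes dplyr 1.0 or later; the helper names mean_plain and mean_embraced and the seed are made up for illustration and are not part of the question.

```r
library(dplyr)

set.seed(1) # hypothetical seed, only so the sketch is reproducible
TestData <- runif(1000, 1, 100) # global vector, as in the question
data01 <- data.frame(TestID = 1:1000, TestData)

# Without embracing: var is evaluated as the global TestData vector,
# so whatever happened to DAT earlier in the pipe is ignored.
mean_plain <- function(DAT, var) {
  DAT %>% summarise(mean = mean(var))
}

# With embracing: var is looked up as a column of DAT.
mean_embraced <- function(DAT, var) {
  DAT %>% summarise(mean = mean({{ var }}))
}

filtered <- data01 %>% filter(TestData > 50)

plain <- filtered %>% mean_plain(TestData)
embraced <- filtered %>% mean_embraced(TestData)

# The plain version matches the mean of the unfiltered data,
# the embraced version matches the mean of the filtered data.
isTRUE(all.equal(plain$mean, mean(data01$TestData)))
isTRUE(all.equal(embraced$mean, mean(filtered$TestData)))
```

If you remove the global TestData vector (for example with rm(TestData)) before calling mean_plain, the call should fail with an "object not found" error, which is another hint that the column was never being used.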
CodePudding user response:
Using {{var}} instead of var will give you the expected result.
custom_summary_cont2 <- function(DAT, var) {
  DAT %>%
    summarise(
      mean = mean({{var}}),
      median = median({{var}}),
      sd = sd({{var}}),
      quant25 = unname(quantile({{var}}, probs = 0.25)),
      quant75 = unname(quantile({{var}}, probs = 0.75)),
      min = min({{var}}),
      max = max({{var}})
    )
}
data01 %>%
  filter(TestData > 50) %>%
  custom_summary_cont2(TestData)
      mean   median       sd  quant25  quant75      min      max
1 74.95416 74.38507 14.66733 62.43108 87.26725 50.08382 99.92935
data01 %>%
  filter(TestData > 50) %>%
  summarise(
    mean = mean(TestData),
    median = median(TestData),
    sd = sd(TestData),
    quant25 = unname(quantile(TestData, probs = 0.25)),
    quant75 = unname(quantile(TestData, probs = 0.75)),
    min = min(TestData),
    max = max(TestData)
  )
      mean   median       sd  quant25  quant75      min      max
1 74.95416 74.38507 14.66733 62.43108 87.26725 50.08382 99.92935
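As a side note, if you would rather pass the column name as a string, one possible variant uses the .data pronoun that dplyr exports. This is only a sketch: the function name summary_by_name and its var_name argument are hypothetical, and it computes just a few of the statistics from the question to keep it short.

```r
library(dplyr)

# No global TestData vector here, only the column inside the data frame
data01 <- data.frame(TestID = 1:1000, TestData = runif(1000, 1, 100))

# var_name is a character string naming a column of DAT
summary_by_name <- function(DAT, var_name) {
  DAT %>%
    summarise(
      mean = mean(.data[[var_name]]),
      median = median(.data[[var_name]]),
      min = min(.data[[var_name]]),
      max = max(.data[[var_name]])
    )
}

result <- data01 %>%
  filter(TestData > 50) %>%
  summary_by_name("TestData")

result
```

Because .data[[var_name]] always refers to the data frame being summarised, the filter is respected here as well, and min should come out above 50.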