This sounds like a trivial question at first, but no one has managed to help me so far, so I'm reaching out to you all.
I'd like to do the following:
I'm writing a simple function that plots two variables against each other, with a third variable coloring the observation points (depending on the corresponding value of the color variable). The code looks like this:
scatterplot <- function(data_used, x.variable, y.variable, color.variable) {
  ggplot(data_used, aes(x=x.variable, y = y.variable)) +
    geom_point(aes_string(color = color.variable))
}
scatterplot(data_used = example_data, x.variable = example_data$education,
            y.variable = example_data$wages,
            color.variable = example_data$sex)
What I would like R to do now is label the x- and y-axes with the names of the variables I choose to plot. In this example, the x-axis would be 'education' and the y-axis would be 'wages'.
I tried simply adding labs(x = x.variable, y = y.variable)
and it doesn't work (when doing that, R labels the axes with the variable values!). By default, R just names the axes "x.variable" and "y.variable".
Can someone help me achieve what I'm trying to do?
Best regards,
xifrix
CodePudding user response:
jpenzer's answer is a good one. Here it is without the quasi-quotation stuff.
library(dplyr)
library(ggplot2)

# Column names are passed as strings, so the same strings can be reused as labels.
scatterplot <- function(data_used, x.variable, y.variable, color.variable) {
  ggplot(data_used, aes_string(x=x.variable, y = y.variable)) +
    geom_point(aes_string(color = color.variable)) +
    labs(x=x.variable, y=y.variable, colour=color.variable)
}
mtcars %>%
  mutate(am = as.factor(am)) %>%
  scatterplot(., x.variable = "hp",
              y.variable = "mpg",
              color.variable = "am")
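Since aes_string() has been soft-deprecated in recent ggplot2 releases, the same string-based idea can also be written with the .data pronoun inside aes(). This is only a minimal sketch: the function name scatterplot_data is made up for illustration, and it assumes a ggplot2 version that supports .data in aes().

```r
library(ggplot2)

# Hypothetical variant of scatterplot(): column names are still passed as strings,
# but they are looked up through the .data pronoun instead of aes_string().
scatterplot_data <- function(data_used, x.variable, y.variable, color.variable) {
  ggplot(data_used, aes(x = .data[[x.variable]], y = .data[[y.variable]])) +
    geom_point(aes(color = .data[[color.variable]])) +
    # The strings double as axis and legend titles.
    labs(x = x.variable, y = y.variable, colour = color.variable)
}

cars <- transform(mtcars, am = as.factor(am))
p <- scatterplot_data(cars, "hp", "mpg", "am")
print(p)
```

The explicit labs() call matters here, because otherwise the axis titles would be derived from the .data[[...]] expressions rather than from the plain column names.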
CodePudding user response:
I'm not sure the quasi-quotation stuff is 100% necessary in hindsight, but this is the pattern I use for similar needs:
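library(dplyr)    # for %>%
library(ggplot2)

my_scatterplot <- function(data, x, y){
  # Capture the unquoted column names passed by the caller
  .x = rlang::enquo(x)
  .y = rlang::enquo(y)
  data %>%
    ggplot(aes(x = !!.x, y = !!.y)) +
    geom_point() +
    labs(x = rlang::as_label(.x),
         y = rlang::as_label(.y))
}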
Let me know if it doesn't work for you; it should, though. Edit: following DaveArmstrong's answer, I should add that the function is called without quotes around the x and y variables, e.g.
diamonds %>% my_scatterplot(price, table)
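For the axis titles, the captured argument has to be turned into text at some point. As a rough illustration of that step (arg_label is a made-up helper name, and this assumes an rlang version that provides as_label()), the captured expression can be converted to a string like this:

```r
library(rlang)

# Hypothetical helper: returns the code the caller supplied for `x` as a string,
# without ever evaluating it.
arg_label <- function(x) {
  as_label(enquo(x))
}

arg_label(price)       # a bare column name
arg_label(log(price))  # a whole expression is labelled by its code
```

Because the argument is never evaluated, this works even when no object called price exists in the calling environment.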
CodePudding user response:
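To pass a column name to the function, you could wrap it in double curly braces {{...}}
in the function body. With aes_string() and quoted names, as here, the braces simply return the string; they act as the embrace operator only in tidy-evaluation functions such as aes().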
library(dplyr)
library(ggplot2)
scatterplot <- function(data_used, x.variable, y.variable, color.variable) {
  ggplot(data_used, aes_string({{x.variable}}, {{y.variable}})) +
    geom_point(aes_string(color = {{color.variable}})) +
    labs(x=x.variable, y=y.variable, colour=color.variable)
}
scatterplot(mtcars %>% mutate(am = as.factor(am)), x.variable = "mpg",
            y.variable = "hp",
            color.variable = "am")
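The double curly braces can also be used with plain aes() and bare (unquoted) column names. The following is a sketch under that assumption; scatterplot_embrace is a hypothetical name, and it relies on rlang (which ggplot2 already depends on) to build the label strings.

```r
library(ggplot2)

# Hypothetical variant that takes bare column names instead of strings.
scatterplot_embrace <- function(data_used, x.variable, y.variable, color.variable) {
  ggplot(data_used, aes(x = {{ x.variable }}, y = {{ y.variable }})) +
    geom_point(aes(color = {{ color.variable }})) +
    # Turn the captured arguments into text for the axis and legend titles.
    labs(x = rlang::as_label(rlang::enquo(x.variable)),
         y = rlang::as_label(rlang::enquo(y.variable)),
         colour = rlang::as_label(rlang::enquo(color.variable)))
}

cars <- transform(mtcars, am = as.factor(am))
p <- scatterplot_embrace(cars, mpg, hp, am)
print(p)
```

ggplot2 normally derives default axis titles from the mapped expression anyway, so the labs() call here is explicit rather than strictly required.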