Labelling x- and y-axis by the corresponding variable's name in R

Time:12-04

This sounds like a trivial question at first, but no one has managed to help me so far, hence I'm reaching out to you all.

I'd like to do the following:

I'm writing a simple function that plots two variables against each other, with a third variable coloring the observation points (depending on the corresponding value of the color variable). The code looks like this:

scatterplot <- function(data_used, x.variable, y.variable, color.variable) {
  
  ggplot(data_used, aes(x=x.variable, y = y.variable)) +
    geom_point(aes_string(color = color.variable)) 
  
}

scatterplot(data_used = example_data, x.variable = example_data$education, 
            y.variable = example_data$wages, 
            color.variable = example_data$sex)

What I would like R to do now is to label the x- and y-axis (respectively) by the corresponding variable's name that I decide to be plotted. In this example here, x-axis would be 'education', y-axis would be 'wages'.

I tried simply adding labs(x = x.variable, y = y.variable), but that doesn't work: R then labels the axes with the variable values! By default, R just names the axes "x.variable" and "y.variable".

Can someone help me achieve what I'm trying to do?

Best regards,

xifrix

CodePudding user response:

jpenzer's answer is a good one. Here it is without the quasi-quotation stuff.

library(dplyr)
library(ggplot2)

# Column names are passed as strings; aes_string() maps them to aesthetics
scatterplot <- function(data_used, x.variable, y.variable, color.variable) {
  
  ggplot(data_used, aes_string(x = x.variable, y = y.variable)) +
    geom_point(aes_string(color = color.variable)) +
    labs(x = x.variable, y = y.variable, colour = color.variable)
  
}

mtcars %>% 
  mutate(am = as.factor(am)) %>% 
  scatterplot(., x.variable = "hp", 
              y.variable = "mpg", 
              color.variable = "am")
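
As a possible alternative to aes_string(), which recent ggplot2 releases mark as deprecated, the same idea can be sketched with the .data pronoun inside aes(). The function name scatterplot_data is made up for this illustration, and the sketch assumes the column names are still passed as strings.

```r
library(ggplot2)

# Hypothetical variant of scatterplot(); column names are passed as strings
scatterplot_data <- function(data_used, x.variable, y.variable, color.variable) {
  # .data[[name]] looks up the column called `name` in the plot data
  ggplot(data_used, aes(x = .data[[x.variable]], y = .data[[y.variable]])) +
    geom_point(aes(colour = .data[[color.variable]])) +
    # Label the axes and the legend with the column names themselves
    labs(x = x.variable, y = y.variable, colour = color.variable)
}

# Build the plot object; print(p) draws it
p <- scatterplot_data(transform(mtcars, am = as.factor(am)),
                      x.variable = "hp",
                      y.variable = "mpg",
                      color.variable = "am")
```

Because the labels are set explicitly in labs(), they do not depend on how ggplot2 would name the .data expressions by default.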

CodePudding user response:

I'm not sure the quasi-quotation stuff is 100% necessary in hindsight, but this is the pattern I use for similar needs:

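library(dplyr)    # for the %>% pipe
library(ggplot2)

my_scatterplot <- function(data, x, y){
  # Capture the unquoted column names as quosures
  .x = rlang::enquo(x)
  .y = rlang::enquo(y)
  data %>%
    # !! unquotes the captured columns inside aes()
    ggplot(aes(x = !!.x, y = !!.y)) +
    geom_point() +
    # as_label() converts each quosure to a plain string for the axis labels
    labs(x = rlang::as_label(.x), 
         y = rlang::as_label(.y))
}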

Let me know if it doesn't work for you; it should, though. Edit: following DaveArmstrong's answer, I should add that this function is called without quotes around the x and y variables, e.g.:

diamonds %>% my_scatterplot(price, table)
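
For comparison, here is a minimal sketch of the same unquoted-argument pattern written with the embrace operator {{ }} instead of enquo() plus !!. The name scatterplot_embrace and its colour argument are hypothetical, added only for illustration, and it assumes reasonably recent ggplot2 and rlang versions that support {{ }}.

```r
library(ggplot2)

# Hypothetical helper; x, y and colour are given as bare column names
scatterplot_embrace <- function(data, x, y, colour) {
  # {{ }} forwards the caller's column expression into aes()
  ggplot(data, aes(x = {{ x }}, y = {{ y }})) +
    geom_point(aes(colour = {{ colour }})) +
    # as_label() turns each captured expression into a string for the labels
    labs(x = rlang::as_label(rlang::enquo(x)),
         y = rlang::as_label(rlang::enquo(y)),
         colour = rlang::as_label(rlang::enquo(colour)))
}

# Build the plot object; print(p) draws it
p <- scatterplot_embrace(transform(mtcars, am = as.factor(am)), hp, mpg, am)
```

The explicit labs() call mainly makes the intent visible and keeps the labels predictable if the function is later changed to transform the columns.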

CodePudding user response:

To pass a column name in the function you could use double curly braces {{...}} around the desired column name in the function body:

library(dplyr)
library(ggplot2)

scatterplot <- function(data_used, x.variable, y.variable, color.variable) {
  
  ggplot(data_used, aes_string({{x.variable}}, {{y.variable}})) +
    geom_point(aes_string(color = {{color.variable}})) +
    labs(x=x.variable, y=y.variable, colour=color.variable)
  
}


scatterplot(mtcars %>% mutate(am = as.factor(am)), x.variable = "mpg", 
            y.variable = "hp", 
            color.variable = "am")
