How can I reorder bars in ggplot according to variable using lapply

Time:08-05

I would like to produce several figures using ggplot and eventually save them to a single pdf. I have the StarWars dataset as a sample below. I have tried the following code and am able to produce the figures for each variable, but I would like to reorder the bars by the y value for each figure. I also get warning messages when I run the code, which I have shown below.

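library(dplyr)      # provides starwars and as_tibble()
library(ggplot2)
library(gridExtra)  # provides marrangeGrob()

starwars <- as_tibble(starwars)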

#Define the y variables I want to plot
variables <- c("height", "mass", "birth_year")

#Define my function to plot
plot_function <- function(var_names) {
  
  ggplot(starwars, aes_string(x = "name", y = var_names)) +
    geom_bar(stat = "identity") +
    theme_mine()  # custom theme, defined elsewhere
}


#Run function across variables of interest
p <- lapply(variables, plot_function)


#Save figures to pdf
ggsave(
  filename = "StarWarsTest.pdf", 
  plot = marrangeGrob(p, nrow=5, ncol=3), 
  width = 8.5, height = 11
)

My Warning messages:

> p <- lapply(variables, plot_function)
Warning messages:
1: In xtfrm.data.frame(x) : cannot xtfrm data frames
2: In xtfrm.data.frame(x) : cannot xtfrm data frames
3: In xtfrm.data.frame(x) : cannot xtfrm data frames

> ggsave(
    filename = "StarWarsTest.pdf", 
    plot = marrangeGrob(p, nrow=5, ncol=3), 
    width = 8.5, height = 11
  )
Warning messages:
1: Removed 6 rows containing missing values (position_stack). 
2: Removed 28 rows containing missing values (position_stack). 
3: Removed 44 rows containing missing values (position_stack). 

In addition to the above attempt, I have also used x = reorder(name, var_names) within the aes() of ggplot to order the bars without luck.

I would like each figure to look like what it produced with this code:

ggplot(starwars, aes(x = reorder(name, height), y = height)) +
  geom_bar(stat = "identity")

Sorry for the long code, any help would be appreciated.

CodePudding user response:

I've taken the liberty of converting your code to use aes() rather than aes_string().

To remove the warnings about missing values, we need to filter the starwars tibble to remove the corresponding rows before plotting. Because lapply doesn't use non-standard evaluation (NSE), we need to quote the column names before passing them to our plotting function. This is admittedly awkward, and I'd be interested in seeing a cleaner alternative.

That said, rewrite your plot_function as

plot_function <- function(y) {
  qY = enquo(y)
  starwars %>% 
    filter(!is.na(!! qY)) %>% 
    ggplot(aes(x=reorder(name, -(!! qY)), y=!! qY)) +
      geom_bar(stat="identity") +
      labs(x="name")
}

Then use lapply

p <- lapply(c(quote(height), quote(mass), quote(birth_year)), plot_function)

to save the reordered plots to a list without warning or error messages.

Here's one element of the list:

[image: one of the reordered bar charts]

The use of enquo() and the bang-bang operator (!!) inside the function is one way of dealing with the tidyverse's NSE inside a function. There are others. For details, see here.
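
One of those other ways, as a rough sketch rather than a definitive answer: keep the column names as plain strings and look them up with the .data pronoun, so nothing has to be quoted before it reaches lapply. This assumes dplyr and ggplot2 (3.0.0 or later) are installed; the argument name var_name is my own choice, and geom_col() is used as the equivalent of geom_bar(stat = "identity").

```r
library(dplyr)
library(ggplot2)

# Column names stay as ordinary strings
variables <- c("height", "mass", "birth_year")

plot_function <- function(var_name) {
  force(var_name)  # evaluate the argument now; aes() is only evaluated at draw time
  starwars %>%
    # drop rows where this particular variable is missing
    filter(!is.na(.data[[var_name]])) %>%
    # .data[[var_name]] looks the column up by its string name;
    # the minus sign sorts the bars from largest to smallest
    ggplot(aes(x = reorder(name, -.data[[var_name]]), y = .data[[var_name]])) +
      geom_col() +
      labs(x = "name", y = var_name)
}

p <- lapply(variables, plot_function)
```

Because each plot is filtered on its own variable, a character with a missing mass still appears in the height plot.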

CodePudding user response:

I took the liberty of making some changes to your code:

  1. used aes() instead of aes_string()
  2. converted the tibble to a data frame
  3. changed the plotting function

The approach I can provide is as follows:

#Convert to dataframe
stars<-as.data.frame(starwars)

Now, in this case, you want the numeric variables and the names. So, I selected only the relevant columns.

#Select relevant columns
stars<-cbind(stars["name"],Filter(is.numeric,stars))

#Remove missing values
stars<- na.omit(stars)

Let's define the function to plot:

#Define my function to plot
plot_function <- function(data_vars) {
  ggplot(stars, aes(x = reorder(name, data_vars), y = data_vars)) +
    geom_bar(stat = "identity") +
    coord_flip()
}

Now, this is the important part. To run the function across the variables of interest, we do not provide a character vector with the column names. Instead, we "feed" lapply() the data frame columns of the variables of interest:

#Run function across variables of interest
p <- lapply(Filter(is.numeric,stars), plot_function)

And you should get the output you want.
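
Since the goal was a single PDF, here is a minimal sketch of one way to write a list of plots to a multi-page file, one plot per page, with the base pdf() device. To keep it runnable on its own it builds a small stand-in list from the built-in mtcars data; the stand-in list, the out_file name and the temporary directory are placeholders of mine, not part of the code above.

```r
library(ggplot2)

# Stand-in for the list of plots `p` (hypothetical example data: mtcars)
p <- lapply(c("wt", "hp"), function(v) {
  ggplot(mtcars, aes(x = reorder(rownames(mtcars), .data[[v]]), y = .data[[v]])) +
    geom_col() +
    coord_flip() +
    labs(x = "car", y = v)
})

# Placeholder output path; replace with the file you actually want
out_file <- file.path(tempdir(), "plots.pdf")

pdf(out_file, width = 8.5, height = 11)  # open a multi-page PDF device
for (plt in p) print(plt)                # each print() starts a new page
dev.off()                                # close the device so the file is written
```

With one plot per page the bar labels stay readable, which can be hard to achieve when many plots share a page.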
